At some point we need actual consequences for sites that intentionally hide their tracking. It should be criminal. It is stalking and has real world consequences. Just because an exploit exists doesn't mean it should be used. That logic is like saying it is OK to break into a house because the lock on the door was weak. If we don't get real protections, at what point does it become justified to go offensive against sites that exploit things like this? If I found someone putting trackers on me with the intent to sell that information (harm me) I would defend myself. When am I allowed to do that in the digital world?
Quick side note here. I appreciate the research calling this out. We need to know the dangers out there to figure out how to protect ourselves, especially since governments don't seem to take this seriously.
I think two things keep the status quo where the end-user is exploited and attacked constantly. The first is the VC / Startup model. Because VC is the true customer, and not the end-user. The second is the current marketing and advertising model. Can it keep working well enough to be worth the money? When it's not, the bottom falls out.
Old business model: solve a problem for your customer, add some value, take home a cut.
Current business model: solve investment return for your investors, get the returns by addicting your end-user to something they don't need.
Future business model: ?
> The first is the VC / Startup model. Because VC is the true customer, and not the end-user.
I don't see how that's related? Anyone looking to increase their revenue looks at tracking. Even I, with my popular open source projects, receive emails asking me to add tracking, let alone businesses that need money to pay their employees.
I am not concerned with being tracked, and assume that large entities on the net have the ability to track and find anything or anybody, ho hum, but my simple personal requirement is not to be then sold to petty merchants and harassed in my own home with ads and fake "personalisations", and offered unasked-for "help", so I watch closely, and go to any length to disable ads, or "fingerprinting", "profiling", or whatever.
The net is horrible. I need socks, but as I am now sensitised to being tracked and followed, I will just get socks at the hardware store, rather than try and track down what were mentioned as perfect travellers' socks and other gear, because the mountain of equipment relentlessly devoted to selling me anything from the waist down hereafter is impossible to contemplate. I now only use search for items required for my business, but am often forced to give up, as the vast majority of the web has been co-opted by major retailers.
Even though I have never been on social media and have no accounts with any of the retailers, people are telling me that they found me and my business through an LLM of some flavor, and/or were convinced of my abilities from my "5 star ratings". I am too busy currently to unravel exactly how the data is put together and then used, but quite clearly, there is no way to use the net (however "lightly") and not be swallowed up and commodified.
Isn't this covered by GDPR?
Umm... but it is criminal. The GDPR, at least, doesn't care how you track users, whether through cookies, local storage, favicons or whatever other mechanism you've developed. If you track users you must follow certain rules, and if you don't, you will be facing fines if/when you're caught.
If you visit, e.g., my physical clothing store, I'm allowed to monitor your in-shop behavior to better optimize my store for your needs. Same for a restaurant etc. That's how _you_ get _much improved services_ and I get _happier customers_.
Ofc I'm not allowed to freaking resell that data. THIS is the problem online: reselling and data brokers. Just KILL these categories of businesses off completely and make _them_ criminal (like even give f prison sentences to their operators).
We should get back to our sanity in ONLINE. As long as you're on _my (online) property_ and using _my services_ I can of course see EVERYTHING you f do, and should stop pretending I don't (as a business, ofc - anonymization exists and not any random employee can access any customer's data, probably should never access both data and identity correlated unless they're actively investigating some serious fraud). As long as I'm not sharing this data with anyone else, I should be 100% allowed to use every drop of this data to improve my services to you and totally differentiate myself from the incompetent competition that can't properly do this.
Data privacy (from EU's GDPR to... everything else) only helps big corporations fend off competition from small startups or boutique shops that could easily out-compete them by offering hyper-personalized, hand-tailored, micro-optimized experiences for their smaller number of customers based on the loads of data they collect from them. In the EU I've only ever seen these kinds of laws severely hamper small boutique or family businesses that wanted to hyperpersonalize to everyone's gain while big corpos easily surf around them with their teams of lawyers.
...we've all been brainwashed by this privacy psyop to sheepishly "fight for our privacy" in ways that are detrimental to us and only help our corporate oligarch overlords maintain an even tighter grip on power, while offering us worse and worse services. Wake the f up, DATA IS MEANT TO BE USED to IMPROVE goods and services, not remain uncollected or sit unused!
> As long as you're on _my (online) property_ and using _my services_ I can of course see EVERYTHING you f do
That's fine, but you are not allowed to send me malware that runs on _my property_ and snoops on _my data_.
Also data doesn't stop being mine, just because you have it. You also can't take photographs of random people and claim this is yours now. That's an important difference between the USA and European countries.
+ as a bonus we'd also incentivize businesses to internalize their marketing and related tech operations (since sharing data with 3rd parties would not be allowed), same for AI-customizations etc., forcing them to tech-ify and become more tech-savvy businesses instead of externalizing all such things to evil big tech (e.g. a clothing store chain could compete not only by producing better clothes, but also by developing better monitoring and generative AI for human-in-the-loop hyperpersonalization, spreading tech out... instead of outsourcing these to tech or big-consulting companies as they do now, when the too-little data they do collect anyhow is otherwise easily shareable with third parties)
FYI for anyone that cares: this attack vector was patched by browsers years ago, fairly soon after this was released.
https://github.com/jonasstrehle/supercookie/issues/30
I was sure this has been a thing for a while, either that or Safari has had a UI bug since forever.
I regularly get the wrong favicon on specific sites, for example the Ars Technica favicon on Reddit.
My hacker news icon has been stuck as the icon for a weather site that I sometimes check. It’s been stuck that way for close to a year now, and has survived an iOS update too.
It persists across profiles and into private browsing mode.
You guys have favicons? I don't have any in my tabs, but maybe I have turned that off at some point. I'm using Mozilla Firefox.
For me HN has been stuck as Facebook's icon for a really long time.
Could site icons be connected somehow to iCloud?
I have the same: the YouTube icon is the Hacker News icon, and the other way round. I have to assume this is some sort of race condition, data corruption, or something else, and it's quite widespread too given all these reports.
For me the iOS HN icon changes between the Reddit and GitHub icons, depending on which one I've been using the most on my phone recently. This happens on both iOS Safari and Kagi's Orion.
I thought that this was just a bug in iOS but based on the comments in this thread, it seems to be common not only across OSes but browser vendors too (I assume iOS Orion uses the same engine as Safari)
Safari has super long lived favicon caches too. The only way to force a rebuild is to set your system clock forward a few years.
According to the GitHub page, you can just run `rm ~/Library/Safari/Favicon\ Cache/*`
yes, but time travel is cooler
I thought I was the only one! Something in the UI cache is so horribly corrupted and it has been for years on my MacBook, I just gave up hope.
I get the same bug in Firefox as well sometimes.
I get the wrong icon for HN in mobile Chrome
Totally unrelated, but what I found interesting: the README hasn’t been touched for years, yet it looks entirely AI generated. Including the commits.
People actually wrote READMEs / commit messages like that before? Have I been living under a rock?
Where do you think the AI training data comes from?
Emoji-heavy documentation/commit messages always seem very popular in JS projects. As this seems to be the project of a 12 (Edit: It's 20, misread) year old, I'm not too surprised that it's a bit unusual compared to others.
Ah, I didn’t know this was made by a child, that makes sense then.
I knew this was part of the JS community, I just didn’t realize AI was literally 1:1 using the same style.
I guess I didn’t realize that the NodeJS community was so dominant.
Or maybe is it because the NodeJS community always had a style of “many small libraries”, which causes them to be over represented?
The README.md says the author was twenty years old when it was written. Am I missing something here?
I read it as twelve. My bad!
What is the live demo supposed to do? I just get stuck in an endless redirect loop with a counter going from 1 to 18 and then restarting. I’m using Safari on iOS.
This was fixed after we reported it a few years ago while working on the paper.
Look at the GitHub repo:
- The last update was 2 years ago.
- It says that MS Edge 87 is affected. The current version of Edge is 142.
This is no longer an issue, but it is interesting thinking about how long the NSA knew about this before the general population did.
On Android/Firefox it showed me my unique ID after the first 18. Then there was a button to try again and that put me in the same loop you're having.
Safari on iOS. It goes to 18/18 and then starts over from 1/18 again for me too. I had not pressed any retry button, this happened the first time I visited the page. And I wasn’t even in private browsing mode. Just navigated to it normally.
Firefox for Android in private browsing mode gets stuck in the loop 100% of the time for me
Related discussion?
"Tales of Favicons and Caches: Persistent Tracking in Modern Browsers"
https://news.ycombinator.com/item?id=25868742
53 comments on 22-jan-2021
Needs a (2023) addition in the title
make it 2021 actually. After these years, was this fixed?
It was fixed for me on Chrome.
Reminds me I noticed macOS Safari pulling in the favicons somewhat frequently when I load the new tab page with favorites on it.
Definitely something I don't want. Maybe I should just remove the favorites or maybe I can save them as redirects or HTML or something.
Note I use private windows most often & shoutout Little Snitch for driving the discovery.
This is an insightful read. One question I have is: how do you ensure a user visits all of the N routes for the ID to be generated, or to be verified on revisits?
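One possible answer, going by what people in this thread report seeing (a chain of redirects with a counter running 1/18 to 18/18): the site does not wait for the user to click around, it redirects the browser through every route itself. The Python below is a toy simulation of that idea, not the project's code. `Browser`, `write_id`, `read_id`, and the convention that a cached favicon stands for a 1-bit are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

N_ROUTES = 18  # matches the 1/18..18/18 counter reported in this thread


@dataclass
class Browser:
    """Toy browser: remembers which routes it already holds a favicon for."""
    favicon_cache: set = field(default_factory=set)

    def visit(self, route: int, server_offers_icon: bool) -> bool:
        """Visit one route; return True if a favicon request was sent."""
        if route in self.favicon_cache:
            return False  # cached icon, so no request reaches the server
        if server_offers_icon:
            self.favicon_cache.add(route)  # icon delivered and cached
        return True


def write_id(browser: Browser, user_id: int) -> None:
    """Write mode: redirect through every route, serving an icon only on 1-bits."""
    for route in range(N_ROUTES):
        bit_is_set = bool((user_id >> route) & 1)
        browser.visit(route, server_offers_icon=bit_is_set)


def read_id(browser: Browser) -> int:
    """Read mode: redirect through every route, never serve an icon,
    and treat each missing favicon request as a 1-bit (assumed convention)."""
    user_id = 0
    for route in range(N_ROUTES):
        requested = browser.visit(route, server_offers_icon=False)
        if not requested:
            user_id |= 1 << route
    return user_id


browser = Browser()
write_id(browser, 0b101100111000101101)
recovered = read_id(browser)
print(recovered == 0b101100111000101101)
```

Because the read pass never serves an icon in this sketch, it leaves the cache unchanged, so the same ID can be read again on a later visit. Clearing ordinary cookies would not touch `favicon_cache`, which is the property the thread is worried about.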
Nonpersistent vm-based browser, I use qemu + cage + firefox and some glue logic to fire up a copy of a base image which gets deleted on exit. Fires up slower than a native firefox instance but runs all the same.
Can containerize for the less paranoid and less work but browsers touching host kernel gives me the ick as does the idea of trying to write ebpf policies for firefox to mitigate. Browsers are pain.
Tried a similar approach but found that putting the browser in a VM has a tendency to expose a few data points that stand out as less trustworthy (like using SwiftShader as the renderer, not having some fonts installed, among other things), which means you end up getting a lot of captchas on some websites. Lying about these can typically be detected as well (like injecting noise into a canvas, modifying the advertised renderer). If you've found any solutions to these please share.
What approach did you end up going with instead?
This sounds interesting, do you have this written up anywhere?
I sadly do not atm beyond some notes but I can if there is interest.
Previous comments (2021)
https://news.ycombinator.com/item?id=26051370
Thanks!
Supercookie: Browser Fingerprinting via Favicon - https://news.ycombinator.com/item?id=26051370 - Feb 2021 (81 comments)
I use a browser that does not support favicon
Wondering why users of popular browsers believe favicon is needed
(I'm assuming users asked the authors of those browsers for favicon)
Do tabs in the popular graphical browsers display a number on each tab by default
This might be useful when switching from, e.g., tab#1 to tab#7, using keyboard shortcut Ctrl-7
Popular browsers support tabs. When you have many tabs open, it's hard to show a meaningful title for each one. An icon takes up less place and is easier to scan for visually.
Mozilla Firefox doesn't shrink tabs any further, but instead lets the tab list go off screen and you can scroll. I think that is a Google Chrome specific thing.
It's a shame that the actual attack mechanism doesn't seem to be detailed on the github repo, and the link to the article is dead.
Paper author here, here’s a valid link: https://www.cs.uic.edu/~polakis/papers/favicon.pdf
https://supercookie.me/workwise
I just got a refresh per second and a counter from 1/18 to 18/18 and repeat. Feels like I wasted 20s.
Nice to see Brave patched it though.
(2023) per readme.md date
(2021) per https://news.ycombinator.com/item?id=45948731
I got different IDs in regular browsing vs incognito mode in Firefox.
Seems like Firefox made changes to address this kind of tracking in version 85.
Do you happen to know where the bug report is?
I got different IDs in regular browsing vs my first incognito window vs my second incognito window.
The demo didn't work for me. Latest Safari on iOS.
I have never liked how Safari always tries to reload favicons. Seems like an obvious and annoying privacy leak.
Why doesn't this apply to any kind of cached content?
I guess that you can do fingerprinting with any cached content, but the insane persistence of the favicon cache makes this much more concerning.
Probably not a popular opinion here but I'm honestly impressed that someone made this work?
There is ad money at stake, and it is unfortunately one of the key revenue models in the modern web. I don't know if this particular research was sponsored by ad-tech or if it's preventive, but it shouldn't be generally surprising that this kind of thing is heavily researched.
I don't understand the live demo
it gave me some ID, but how do I test that some different website can track me resulting in same ID?
or is it only "detect private browsing/container on same browser" kind of stuff?
The "supercookie" phenomenon from a few years ago (when this was created) was that despite using private browsing or deleting cookies, the site id remains the same for the same browser.
It could track you between site visits, at a minimum.
Does it work if you disable favicons? (I disabled favicons when I set up the computer, but for a different reason; it is a feature that I don't use.)
If websites can detect that you've disabled favicons, then you are easy to track between all websites because you are very unusual.
I don't think that's true. You'll just look like someone who already has it cached.
It depends on how the browser rejects favicons? If the browser reports the icon is already cached, I agree (assuming the reports are indistinguishable). But maybe it just never downloads the icon, for example.
> If the browser reports the icon is already cached
Browsers don't do that.
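To make the disagreement above concrete, here is a small sketch of what a server could infer if it only sees which favicon requests arrive. It assumes a hypothetical `decode` step where a route with no favicon request counts as a 1-bit, and 18 routes as in the demo counter; neither detail is confirmed by the thread.

```python
N_ROUTES = 18  # route count taken from the demo counter mentioned in this thread


def decode(requested_routes: set) -> int:
    """Hypothetical read step: a route with no favicon request counts as a 1-bit."""
    user_id = 0
    for route in range(N_ROUTES):
        if route not in requested_routes:
            user_id |= 1 << route
    return user_id


# A fresh browser requests the icon on every route, so it reads as ID 0.
fresh = decode(set(range(N_ROUTES)))

# A browser with favicons disabled never requests any icon.
alice_no_favicons = decode(set())
bob_no_favicons = decode(set())

print(fresh)
print(alice_no_favicons == bob_no_favicons == 2**N_ROUTES - 1)
```

Under these assumptions both comments are partly right: a favicon-disabled browser does look like one that has every icon cached, but every such browser collapses into the same all-ones value, so this channel alone can flag the group as unusual without telling its members apart.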
Delete cookies and site data on Firefox works.
This is great, I needed more tools for tracking bad users who have been banned and try to ban evade. I have been using Samy Kamkar's evercookie, which is pretty good, but some of the techniques are dated.
Did anyone ever make use of this in practice? 32 redirects to construct a unique ID seems very impractical.
Ad networks don’t care. It’s a data leak. Even a few extra bits can be valuable to tag you with a better uid.
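For a rough sense of the numbers in this exchange, the sketch below assumes each redirect carries exactly one bit (an assumption, though it fits the 18-step demo and the 32 redirects mentioned above). The helper names `id_space` and `redirects_needed` are made up for illustration.

```python
def id_space(redirects: int) -> int:
    """Number of distinct IDs if each redirect carries one bit."""
    return 2 ** redirects


def redirects_needed(population: int) -> int:
    """Smallest number of one-bit redirects that gives every member a distinct ID."""
    return max(1, (population - 1).bit_length())


print(id_space(18))                     # 262144
print(id_space(32))                     # 4294967296
print(redirects_needed(8_000_000_000))  # 33
print(id_space(4))                      # 16
```

The last line is the "few extra bits" case: even 4 redirects would split any existing fingerprint bucket 16 ways, which is why a partial ID can still be useful when combined with other signals.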
Can’t wait for this to be abused and linked to your digital ID through the wallet app!