For the fingerprinting part, can you explain the difference with the JShelter browser extension (https://jshelter.org/)?
I checked as you did in your demo video with https://demo.fingerprint.com/playground (using JShelter in Firefox). It produces a fingerprint detector report, like so:
{
}
However, it appears there is no way to display what was actually produced by the browser.
Was this the reason you had to build your own browser? Or is it possible to extend JShelter to do the same?
Ooh nice, I hadn’t seen this project! I actually tried this as an extension at first but wasn’t able to override page window functions. I’m curious how they accomplished this. (edit: I see that I missed the chrome.scripting API, facepalm)
Thank you for sharing :)
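For anyone curious about the chrome.scripting route mentioned above: a minimal sketch (not JShelter's actual implementation) of how a Manifest V3 extension can run a script in the page's MAIN world early enough to override window functions. The registration call is shown in comments since it only works inside an extension; `patchMethod` and `fakeNavigator` are made-up names to show the override mechanics outside a browser.

```javascript
// Inside an MV3 extension service worker, a content script can be
// registered to run in the page's own JS world before page scripts:
//
//   chrome.scripting.registerContentScripts([{
//     id: "fp-patch",
//     js: ["patch.js"],
//     matches: ["<all_urls>"],
//     runAt: "document_start",
//     world: "MAIN",          // page world, not the isolated world
//   }]);
//
// patch.js can then shadow fingerprinting surfaces. The override
// pattern itself is ordinary JavaScript:
function patchMethod(obj, name, wrapper) {
  const original = obj[name];
  obj[name] = function (...args) {
    // Delegate to the wrapper, which gets access to the real function.
    return wrapper(original ? original.bind(this) : null, args);
  };
}

// Stand-in for e.g. `navigator` or `CanvasRenderingContext2D.prototype`
// (hypothetical object, purely to demonstrate the mechanics):
const fakeNavigator = { platform: () => "Linux x86_64" };
patchMethod(fakeNavigator, "platform", () => "MacIntel");
```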
FWIW I still think a custom browser approach has some benefits (stealth, and executing in out-of-process iframes; could be wrong on the second part, haven’t actually tested!)
EDIT: Oh, it took me a minute!
Most of my job is reverse engineering a major website builder company's code so we can leverage their undocumented features. It's often a difficult job but your project could make it easier. I'm sure there are others out there that will find this useful.
This is such an eye-opening and really interesting project. It reminded me of projects like XprivacyLua that "expose" the different calls and requests from Android apps. Great work!
In the past I've considered forking Chromium so every asset that it downloads (images, scripts, etc.) is saved somewhere to produce a sort of "passive scraper".
This article made me consider creating a new CDP domain as a possible option, but tbf I haven't thought about this problem in ages so maybe there's something less stupid that I could do.
Ha, I've had the exact same thought before, but due to lack of experience and time constraints I ended up using mitmproxy with a small Python script instead. It was slow and buggy, but it served its purpose...
While searching for a tool I found several others asking for something similar, so I'm sure there are quite a few who would be interested in the project if you ever do decide to pick it up.
It's not quite the same, but in the past I've written (in python) scrapers that run off of the cache. E.g. it would extract recipes from web pages that I had visited. The script would run through the cache and run an appropriate scraper based on the url. I think I also looked for json-ld and microdata.
The downsides were that it only works with cached data, and I had to tweak it a couple of times because they changed the format of the cache keys.
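The JSON-LD half of a scraper like that fits in a few lines. A sketch, with made-up names (`extractJsonLd`, `findRecipes`); a real version would use a proper HTML parser rather than a regex:

```javascript
// Pull JSON-LD blocks out of a cached HTML blob. The regex over
// <script type="application/ld+json"> tags is only for illustration.
function extractJsonLd(html) {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const out = [];
  for (const m of html.matchAll(re)) {
    try {
      out.push(JSON.parse(m[1]));
    } catch {
      // Skip malformed blocks rather than failing the whole page.
    }
  }
  return out;
}

function findRecipes(html) {
  // A JSON-LD script may hold a single object or an array of objects,
  // so flatten one level before filtering on the schema.org type.
  return extractJsonLd(html)
    .flat()
    .filter((d) => d && d["@type"] === "Recipe");
}
```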