For the fingerprinting part, can you explain the difference with the JShelter browser extension (https://jshelter.org/)?
I checked as you did in your demo video with https://demo.fingerprint.com/playground (using JShelter in Firefox). It produces a fingerprint detector report, like so:
{
}
However, it appears there is no way to display what was actually produced by the browser.
Was this the reason you had to build your own browser? Or is it possible to extend JShelter to do the same?
Ooh nice, I hadn’t seen this project! I actually tried this as an extension at first but wasn’t able to override page window functions. I’m curious how they accomplished this. (edit: I see that I missed the chrome.scripting API, facepalm)
Thank you for sharing :)
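For anyone curious about the chrome.scripting route mentioned above: a minimal sketch (not JShelter's actual implementation) of how a Manifest V3 extension can run a script in the page's MAIN world early enough to override window functions. The registration call is shown in comments since it only works inside an extension; `patchMethod` and `fakeNavigator` are made-up names to show the override mechanics outside a browser.

```javascript
// Inside an MV3 extension service worker, a content script can be
// registered to run in the page's own JS world before page scripts:
//
//   chrome.scripting.registerContentScripts([{
//     id: "fp-patch",
//     js: ["patch.js"],
//     matches: ["<all_urls>"],
//     runAt: "document_start",
//     world: "MAIN",          // page world, not the isolated world
//   }]);
//
// patch.js can then shadow fingerprinting surfaces. The override
// pattern itself is ordinary JavaScript:
function patchMethod(obj, name, wrapper) {
  const original = obj[name];
  obj[name] = function (...args) {
    // Delegate to the wrapper, which gets access to the real function.
    return wrapper(original ? original.bind(this) : null, args);
  };
}

// Stand-in for e.g. `navigator` or `CanvasRenderingContext2D.prototype`
// (hypothetical object, purely to demonstrate the mechanics):
const fakeNavigator = { platform: () => "Linux x86_64" };
patchMethod(fakeNavigator, "platform", () => "MacIntel");
```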
FWIW I still think a custom browser approach has some benefits (stealth, and executing in out-of-process iframes; could be wrong on the second part, haven’t actually tested!)
EDIT: Oh, it took me a minute!
Most of my job is reverse engineering a major website builder company's code so we can leverage their undocumented features. It's often a difficult job but your project could make it easier. I'm sure there are others out there that will find this useful.
This is such an eye-opening and really interesting project. It reminded me of projects like XprivacyLua that "expose" the different calls and requests from Android apps. Great work!
In the past I've considered forking Chromium so every asset that it downloads (images, scripts, etc.) is saved somewhere to produce a sort of "passive scraper".
This article made me consider creating a new CDP domain as a possible option, but tbf I haven't thought about this problem in ages so maybe there's something less stupid that I could do.
Ha, I've had the exact same thought before, but due to lack of experience and time constraints I ended up using mitmproxy with a small Python script instead. It was slow and buggy, but it served its purpose...
While searching for a tool I found several others asking for something similar, so I'm sure there are quite a few who would be interested in the project if you ever do decide to pick it up.
It's not quite the same, but in the past I've written (in python) scrapers that run off of the cache. E.g. it would extract recipes from web pages that I had visited. The script would run through the cache and run an appropriate scraper based on the url. I think I also looked for json-ld and microdata.
The downsides were that it only works with cached data, and I had to tweak it a couple of times because they changed the format of the cache keys.
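The JSON-LD half of a scraper like that fits in a few lines. A sketch, with made-up names (`extractJsonLd`, `findRecipes`); a real version would use a proper HTML parser rather than a regex:

```javascript
// Pull JSON-LD blocks out of a cached HTML blob. The regex over
// <script type="application/ld+json"> tags is only for illustration.
function extractJsonLd(html) {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const out = [];
  for (const m of html.matchAll(re)) {
    try {
      out.push(JSON.parse(m[1]));
    } catch {
      // Skip malformed blocks rather than failing the whole page.
    }
  }
  return out;
}

function findRecipes(html) {
  // A JSON-LD script may hold a single object or an array of objects,
  // so flatten one level before filtering on the schema.org type.
  return extractJsonLd(html)
    .flat()
    .filter((d) => d && d["@type"] === "Recipe");
}
```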