Readit News
banditelol commented on Show HN: Donut Browser, a Browser Orchestrator   donutbrowser.com/... · Posted by u/andrewzeno
squeegee_scream · 3 months ago
Related: https://github.com/johnste/finicky, “A macOS app for customizing which browser to start”. Write a JSON file to tell it when to open a link in a certain browser, to strip certain strings like utm codes, etc
banditelol · 3 months ago
I tried this before, but since I often need to open a different browser even when the link comes from the same app, I ended up moving to https://github.com/will-stone/browserosaurus

Not to say you can't use both tho

banditelol commented on Fivetran to acquire Census   fivetran.com/blog/why-fiv... · Posted by u/njaremko
zoogeny · 4 months ago
All of these tools are insanely expensive (from my own experience at companies that have used them). I understand it, since building your own pipeline to handle the kind of throughput analytics takes is expensive and time consuming. Business leaders want the visibility but don't want to redirect dev resources to build and maintain these creaky data pipelines. It is the perfect market of high-value and low tolerance for build (on the build or buy spectrum).

But I am not going to pay $1000/month as a bootstrap startup. What open source alternatives exist that can be run on basic hardware?

banditelol · 4 months ago
I've tried Airbyte, Sling, and dlt (besides building several tools from scratch)

My best bet for now would be dlt if you have a dedicated DE team, but Sling will get you a long way for moving data around your warehouse

banditelol commented on Fivetran to acquire Census   fivetran.com/blog/why-fiv... · Posted by u/njaremko
mritchie712 · 4 months ago
The best open source options are Airbyte and Meltano / Singer. But it's hard to keep them running. If you self-host them, you'll hit issues at least a few times a month which can each take a few hours to solve.

It's not like running Postgres which "just works". When you self-host Airbyte, you're still building a good bit.

I felt the same way about the cost of data tools. Paying $1,000 for Fivetran, $2,000 for Snowflake, $2,000 for Looker seemed crazy. We bundle all three for $500 / month at https://www.definite.app

banditelol · 4 months ago
Hi, I've been looking for something like this! Do any of your customers have a success story migrating off BigQuery to your platform? And how do you compare to MotherDuck? (Looks like you built some of your stack on top of DuckDB)
banditelol commented on Show HN: Benchmarking VLMs vs. Traditional OCR   getomni.ai/ocr-benchmark... · Posted by u/themanmaran
banditelol · 6 months ago
Has anyone tried comparing with a Qwen VL based model? I heard good things about its OCR performance compared to other self-hostable models, but I haven't really tried benchmarking it
banditelol commented on Show HN: Free Online Tool to Experience Microsoft's MarkItdown   markitdown.pro... · Posted by u/kianworkk
klabetron · 8 months ago
> Your files are processed directly in your browser, and no files are uploaded to our servers or stored by us.

How does it run a Python library entirely browser side? Just curious.

(Given the faff of setting up a Python environment, this is a great idea.)

banditelol · 8 months ago
Now you make me wonder if I could run this entirely inside PyScript
banditelol commented on Demystifying Git Submodules   cyberdemon.org/2024/03/20... · Posted by u/signa11
abdullahkhalids · 9 months ago
I am running simulations with a rapidly evolving codebase. I have a separate repo with all the simulation code in it. I want to tie each simulation to the git commit (of the main repo) at which it was run. Are git submodules the correct solution to this in any way?
banditelol · 9 months ago
I think you want something along the lines of DVC (github.com/iterative/dvc)
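
For readers who only need the "which commit produced this run" part, a minimal sketch follows. It is not DVC and not something proposed in the thread; it assumes `git` is on the PATH and that the simulation is launched from inside the main repo. The function names and the record layout are hypothetical.

```python
import json
import subprocess
from datetime import datetime, timezone


def current_commit(repo_dir="."):
    """Return (commit_hash, is_dirty) for the git repo at repo_dir."""
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout.strip()
    # Non-empty porcelain output means there are uncommitted changes.
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return commit, bool(status.strip())


def build_run_record(commit, dirty, params):
    """Bundle the code version with the simulation parameters (hypothetical layout)."""
    return {
        "commit": commit,
        "dirty": dirty,
        "params": params,
        "started_at": datetime.now(timezone.utc).isoformat(),
    }


def save_run_record(record, path):
    """Write the record next to the simulation output as JSON."""
    with open(path, "w") as f:
        json.dump(record, f, indent=2, sort_keys=True)
```

Calling `save_run_record(build_run_record(*current_commit(), params), "run.json")` at the start of each run would leave a small file per simulation. The `dirty` flag matters because a run started with uncommitted changes cannot be reproduced from the commit hash alone.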
banditelol commented on Show HN: BemiDB – Postgres read replica optimized for analytics   github.com/BemiHQ/BemiDB... · Posted by u/exAspArk
banditelol · 10 months ago
Looking at the syncer, it seems to copy the whole table to CSV every time (?) Code: https://github.com/BemiHQ/BemiDB/blob/6d6689b392ce6192fe521a...

I can't imagine up to what scale you can do this. Is there anything better we can do before reaching for Debezium to sync the data via CDC?

Edit: add code permalink
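
One commonly used middle ground between full-table copies and log-based CDC is watermark-based incremental extraction. The sketch below is an illustration only, not how BemiDB's syncer works: it uses in-memory SQLite as a stand-in for both sides, and the `items` table, its `updated_at` column, and the function names are all assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the hypothetical items table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER)"
    )
    return db


def sync_incremental(src, dst, last_seen):
    """Copy rows changed since last_seen from src to dst; return the new watermark."""
    rows = src.execute(
        "SELECT id, payload, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Upsert by primary key so an updated row overwrites its stale copy.
    dst.executemany(
        "INSERT OR REPLACE INTO items (id, payload, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    dst.commit()
    # Keep the old watermark when nothing changed.
    return rows[-1][2] if rows else last_seen
```

Each call moves only rows whose `updated_at` is past the stored watermark, so the cost scales with the change volume and not the table size. The trade-offs are that deletes are never seen, the source must maintain `updated_at` reliably, and the strict `>` comparison can skip rows that share the watermark value, which is where log-based CDC starts to pay off.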

banditelol commented on Cats are (almost) liquid   cell.com/iscience/fulltex... · Posted by u/lnyan
accrual · 10 months ago
I love C&H and am blown away there was something so applicable. Felt like an XKCD moment!
banditelol · 10 months ago
Lol I automatically read C&H as Cyanide and Happiness
banditelol commented on Understanding the Limitations of Mathematical Reasoning in LLMs   arxiv.org/abs/2410.05229... · Posted by u/hnhn34
golol · a year ago
I'm a math PhD student at the moment and I regularly use o1 to try some quick calculations I don't feel like doing. While I feel like GPT-4o is so distilled that it just tries to know the answer from memory, o1 actually works with what you gave it and tries to calculate. It can be quite useful.
banditelol · a year ago
I'm curious, what kind of quick calculations do you usually use an LLM for?

Edited for clarity


u/banditelol

Karma: 19 · Cake day: June 25, 2019
About
meet.hn/city/-6.9215529,107.6110212/Bandung

Socials: - github.com/banditelol

---
