When you’re in the hot seat, and someone asks “Who approved this?”, the truthful answer is that no one approved it.
2020: the first AI craze, introducing “Einstein” as their name for their analytics platform, and officially changing the corporate vision to being the “No. 1 AI CRM company”.
2021: Now it’s all about “Customer 360”, i.e. account-based marketing, i.e. what basically everyone else does without such a memeable name. You wouldn’t believe the number of slide decks I had to sit through with all our little product logos orbiting this stock art character straight out of Women Laughing While Eating Salad.
2022: Never mind, now we’re betting the company on a real-time unified database called Genie, which was neither real-time nor unified (and eventually not called Genie either). Got sued for that one.
2024: AGENTS. AGENTS EVERYWHERE. WE ARE AN AGENT COMPANY NOW.
So, let’s see how this holds up in the face of the next hot thing.
And ideally have the whole thing open source and be able to run it in CI.
We tried PeerDB + ClickHouse, but ClickHouse materialized views are not refreshed when they join tables.
Right now we’re back to standard materialized views inside Postgres refreshed once a day but the full refreshes are pretty slow… the operational side is great though, a single db to manage.
I don't find this very impressive. Forget LLMs for a second. Let's say _you_ read a question of that kind with some bit of irrelevant information. There are two possibilities you have to consider: the question may as well have excluded the irrelevant information, or the question was miswritten and the irrelevant information was meant to be relevant. The latter is a perfectly live possibility, and I don't think it's a dramatic failure to assume that this is correct. I have to confess that when I read some people's LLM gotcha questions, where they take some popular logic puzzle and invert things, I think I would get them "wrong" too. And not wrong because I don't understand the question, but wrong because with no context I'd just assume the inversion was a typo.
I don't see this as a material limitation of LLMs but rather as something that can be addressed at the application level by stripping out irrelevant information.
Asking happy team members to review your company is no different than apps asking frequent users to review on the App Store.
I have successfully used DuckDB like above for preparing an ML dataset from about 100GB of input.
DuckDB is undergoing rapid development these days. There have been format-breaking changes and bugs that could lose data. I would not yet trust DuckDB for long-term storage or archival purposes. Parquet is a better choice for that.
https://pedram.substack.com/p/streaming-data-pipelines-with-...
I am somewhat at odds with it being a default extension built into the DuckDB release. This is still a feature/product coming from a different company than the makers of DuckDB [1], though they did announce a partnership with the makers of this UI [2]. Whilst DuckDB has so far thrived without VC money, MotherDuck has (at least) 100M in VC [3].
I guess I'm wondering where the lines are between free and open source work compared to commercial work here. My assumption has been that the line is what DuckDB ships and what others in the community do. This release seems to change that.
Yes, I do like and use nice, free things. And I understand that things have to be paid for by someone. Sometimes that someone is even me. I guess I'd like clarification on the future of DuckDB as its popularity and reach are growing.
[1] https://duckdblabs.com
[2] https://duckdblabs.com/news/2022/11/15/motherduck-partnershi...
[3] https://motherduck.com/blog/motherduck-open-for-all-with-ser...
edit: I don't want to leave this negative-sounding post here without an addendum. I'm just concerned about the future monetization strategies and roadmap of DuckDB. DuckDB is a good, useful, and versatile tool. I mainly use it from Python through Jupyter, in the browser and native. I haven't felt the need for commercial services (plus purchasing them from my professional setting is too convoluted). This UI, whilst undoubtedly useful, seems to be leaning towards the commercial side. I merely wanted some clarity on what it might entail. I do hope DuckDB and its community achieve even greater, better things, with requisite compensation for those who work to ensure this.
There is always going to be some overlap between open source contributions and commercial interests, but unless a real problem emerges, like core features getting locked behind paywalls, there is no real cause for concern. If that happens, then sure, let’s talk about it and raise the issue in a public forum. But for now it is just a nice convenience feature that some people (like me) will find useful.