Readit News
volderette commented on Apache Iceberg   iceberg.apache.org/... · Posted by u/jacobmarble
jl6 · 7 months ago
Why move away from BigQuery? Just wondering if it’s a cost thing.
volderette · 7 months ago
Yes, mainly driven by cost. BigQuery is really unpredictable when dashboards with filters are being used intensively by users. We don’t want to limit our users in their data exploration.
volderette commented on Apache Iceberg   iceberg.apache.org/... · Posted by u/jacobmarble
volderette · 7 months ago
How do you query your Iceberg tables? We are looking into moving away from BigQuery, and StarRocks [1] looks like a good option.

[1] https://www.starrocks.io/

volderette commented on Show HN: ReverseETL – The open-source alternative to Hightouch and Census   github.com/Multiwoven/mul... · Posted by u/nagstler
nagstler · a year ago
This project will always remain open-source.
volderette · a year ago
That sounds great, thanks!
volderette commented on Show HN: ReverseETL – The open-source alternative to Hightouch and Census   github.com/Multiwoven/mul... · Posted by u/nagstler
volderette · a year ago
Will this stay open source or will it end up as another limited open core data product?
volderette commented on Is the "modern data stack" still a useful idea?   roundup.getdbt.com/p/is-t... · Posted by u/tim_sw
staticautomatic · 2 years ago
I'm a newly minted head of analytics who transitioned from a different domain, so I never had to muck my way through the MDS but attentively watched others from the sidelines over the last few years. Best I can tell, "the modern data stack" is just a marketing phrase invented by a cadre of vampire vendors. The lessons I learned watching others translated into a few simple requirements for our nascent "stack" that most importantly include transparent pricing I can reason about and divvy up, as many integrations as possible so I can minimize rolling my own, and a straightforward framework for ETL code. These three requirements plainly disqualify most of the MDS universe.

With the benefit of starting basically from scratch and not having to mess around with real-time analytics, it's pretty easy to ignore the MDS vendors. So far I've landed on BigQuery, Airbyte, GitHub, BI Engine, Looker Studio, and Pandas 2.x or DuckDB for local stuff. I send as many things as possible straight to BQ, lock junior analysts out of gigantic tables, archive periodically to partitioned Parquet files in cold storage, use mostly turnkey integrations, and ruthlessly prioritize custom ETL jobs. Putting GitHub in the mix isn't super ergonomic and we may be in the market for new tools once we cross the "big data" frontier, but that'll be a while from now. I'll probably never know or care what the MDS vendors think I'm missing.

volderette · 2 years ago
Airbyte is definitely one of the MDS vendors. Plus they have a ton of bugs because their only focus is having the most connectors on the market. A lot of them are broken or badly implemented.
volderette commented on Ask HN: Dbt (data built tool) alternatives    · Posted by u/mateuszklimek
volderette · 2 years ago
There is sqlmesh, which implements some nice concepts like blue-green deployments.

https://sqlmesh.readthedocs.io/en/latest/

volderette commented on Plane got to top spot in project management on GitHub in less than a year   plane.so/blog/how-we-got-... · Posted by u/emreb
luke-stanley · 2 years ago
Okay, it's open source, but self-hosting it is not straightforward. The repo's self-hosting doc link returns a 404. Then after manually finding https://docs.plane.so/self-hosting/self-hosting, I am warned that there is a dizzying array of 4 env files. I suppose they are in a tricky situation: while they want to stand out as open source, actually having people easily self-host it is perhaps not a goal that is currently in their interest! Correction: removed wrong Docker-Compose command interpretation, as I have been schooled!
volderette · 2 years ago
Docker integrated docker-compose a while ago, so the command is correct.
volderette commented on Launch HN: Serra (YC S23) – Open-core, Python-based dbt alternative    · Posted by u/Alanhlwang
iamjk · 2 years ago
Finally a competitor to dbt. The world needs this!
volderette · 2 years ago
There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.
volderette commented on My DIY ergonomic travel workstation with aluminum and magnets   thume.ca/2022/11/06/diy-t... · Posted by u/trishume
reacharavindh · 3 years ago
I used to be able to read Reddit links on my phone browser; now they simply show a useless banner preventing me from being able to even read the thing unless I download their tracker. Now I need to train my brain to never click on Reddit links when reading HN on my phone :-(
volderette · 3 years ago
There are browser extensions like PrivacyRedirect (iOS app) and LibRedirect (Chrome, Firefox …) that can redirect you to alternative reddit frontends (teddit and so on).
volderette commented on Google Tag Manager, the new anti-adblock weapon (2020)   chromium.woolyss.com/f/HT... · Posted by u/thyrox
x0x0 · 4 years ago
Current GTM, configured (via the server UI) to inject tracker X:

gtm javascript loads, pulls down the config, injects tracker X javascript into the browser

new gtm:

gtm javascript loads, pulls down config, streams events to google servers to fan out to tracker X as configured

So blocking gtm.js off tagmanager.google.com / www.googletagmanager.com / the various other domains still blocks all gtm injected tags.

The tl;dr is they've become much closer to Segment, which does the data fanout internally. But they should still be straightforward to block.

volderette · 4 years ago
This is not how GTM server side works. There is not a single call to Google domains from the client, when GTM server side is set up to its fullest. The config (gtm.js) will be loaded from my subdomain and not googletagmanager.com. Also gtm.js can be renamed.
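
To make the point concrete, here is a rough sketch, assuming a blocker that matches purely on hostname (real blockers may use other heuristics as well). The subdomain `metrics.example.com` and the file name `a.js` are made-up placeholders, not anything from a real setup:

```python
from urllib.parse import urlsplit

# Hostnames a hypothetical hostname-only blocklist might contain.
BLOCKED_HOSTS = {"www.googletagmanager.com", "tagmanager.google.com"}


def is_blocked(url: str) -> bool:
    """Return True if the URL's host is a blocked host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == b or host.endswith("." + b) for b in BLOCKED_HOSTS)


# Classic client-side setup: the container script comes from a Google domain.
client_side = "https://www.googletagmanager.com/gtm.js?id=GTM-XXXX"

# Server-side setup as described above: first-party subdomain, renamed script.
# Both the subdomain and the file name are illustrative assumptions.
server_side = "https://metrics.example.com/a.js?id=GTM-XXXX"
```

With this kind of list, `is_blocked(client_side)` matches, while `is_blocked(server_side)` does not, because nothing in the request points at a Google domain.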

u/volderette

Karma: 25 · Cake day: August 27, 2019