Readit News
ketozhang commented on Read to forget   mo42.bearblog.dev/read-to... · Posted by u/diymaker
HPsquared · 3 months ago
That's the idea behind "Getting Things Done" (GTD)
ketozhang · 3 months ago
GTD has the addition that you must create a system of reminders/followups. GTD is great to practice being okay with forgetting stuff and trusting your tracking system.
ketozhang commented on Polars Cloud and Distributed Polars now available   pola.rs/posts/polars-clou... · Posted by u/jonbaer
drej · 4 months ago
Having done a bit of data engineering in my day, I'm growing more and more allergic to the DataFrame API (which I used 24/7 for years). From what I've seen over the past ~10 years, 90+% of use cases would be better served by SQL, both from the development perspective as well as debugging, onboarding, sharing, migrating etc.

Give an analyst AWS Athena, DuckDB, Snowflake, whatever, and they won't have to worry about looking up what m6.xlarge is and how it's different from c6g.large.

ketozhang · 3 months ago
I think your argument focuses a lot on the scenario where you already have cleaned data (i.e., a data warehouse). Many other data engineers and I agree: in that case you're better off hosting it in a SQL RDBMS.

However, before that, you need a lot of code to clean the data, and raw data does not fit well into a structured RDBMS. Here you choose to map your raw data into either a row view or a table view. You're now left with the choice of either inventing your own domain object (row view) or using a dataframe (table view).
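
A minimal sketch of the two choices, using only the standard library. The sample records, the `Reading` class, and `parse_temp` are hypothetical, and a plain dict of lists stands in for a real dataframe:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical raw records, e.g. parsed from a messy export.
raw = [
    {"id": "1", "temp": " 21.5 "},
    {"id": "2", "temp": "n/a"},
    {"id": "3", "temp": "19.0"},
]

def parse_temp(text: str) -> Optional[float]:
    """Return the temperature as a float, or None if it cannot be parsed."""
    try:
        return float(text.strip())
    except ValueError:
        return None

# Row view: a hand-written domain object, cleaned one record at a time.
@dataclass
class Reading:
    id: int
    temp: Optional[float]

rows = [Reading(int(r["id"]), parse_temp(r["temp"])) for r in raw]

# Table view: one column per field, cleaned one column at a time.
table = {
    "id": [int(r["id"]) for r in raw],
    "temp": [parse_temp(r["temp"]) for r in raw],
}
```

In the row view every new field means touching the class; in the table view it means adding a column.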

ketozhang commented on Where's the shovelware? Why AI coding claims don't add up   mikelovesrobots.substack.... · Posted by u/dbalatero
noodletheworld · 4 months ago
> people are going to get a variety of results.

Yes, but the point of this article is surely that on average if it's working, there would be obvious signs of it working by now.

Even if there are statistical outliers (i.e., 10x productivity using the tools), if on average it does nothing to the productivity of developers, something isn't working as promised.

ketozhang · 3 months ago
We need long-running averages, and 2023-2025 is still too early to determine that it's not effective. I'd argue the barriers to entry in 2023 and 2024 were too high for inexperienced developers to start churning out software. For seasoned developers, there was skepticism, and company adoption wasn't there yet (and still isn't).
ketozhang commented on Where's the shovelware? Why AI coding claims don't add up   mikelovesrobots.substack.... · Posted by u/dbalatero
ketozhang · 3 months ago
The data is surprising. However, I do wish this article looked carefully into barriers to entry, as they can explain the lack of increase in your data.

For example, on Steam, it costs $100 to release a game. You may extend your game with what's called a DLC, and that costs $0 to release. If I were to build shovelware, especially with AI-generated content, I'd be more keen to make a single game with a bunch of DLC.

For game development, integration of AI into engines is another barrier. There aren't that many engines that give AI an interface to work with. The obvious interface is games that can be built entirely with code (e.g., pygame; even Godot is a big stretch).

ketozhang commented on uv: An extremely fast Python package and project manager, written in Rust   github.com/astral-sh/uv... · Posted by u/chirau
incognito124 · 6 months ago
uv is almost perfect. my only pet peeve is updating dependencies. sometimes I just want to go "uv, bump all my dependencies to the latest version possible while respecting their constraints". I still haven't found an elegant way to do this, but I have written a script that parses pyproject.toml, removes the deps, and invokes `uv add --upgrade` with them.

other than that, it's invaluable to me, with the best features being uvx and PEP 723

ketozhang · 6 months ago
You could either delete the .venv and recreate it or run `uv pip install --upgrade .`

Much prefer not thinking about venvs.

ketozhang commented on The Dunning-Kruger effect is autocorrelation   economicsfromthetopdown.c... · Posted by u/ljosifov
crazygringo · 2 years ago
Yup. Assuming the sample sizes are statistically significant, the original paper clearly shows:

- On average, people estimate their ability around the 65th percentile (actual results) rather than the 50th (simulated random results) -- a significant difference

- That people's self-estimation increases with their actual ability, but only by a surprisingly small degree (actual results show a slight upwards trend, simulated random results are flat) -- another significant difference

The author's entire discussion of "autocorrelation" is a red herring that has nothing to do with anything. Their randomly-generated results do not match what the original paper shows.

None of this really sheds much light on to what degree the results can be or have been robustly replicated, of course. But there's nothing inherently problematic whatsoever about the way it's visualized. (It would be nice to see bars for variance, though.)

ketozhang · 2 years ago
The autocorrelation is important to show that its transformation to a D-K plot will always give you the D-K effect for independent variables.

However, the focus on autocorrelation is not very illuminating. We can explain the behaviors found quite easily:

- If everyone's self-assessment scores are (uniformly) random guesses, then the average self-assessment score for any quantile is 50%. Then of course those in the lower quantiles (less skilled) are overestimating.

- If self-assessment score vs actual score are dependent proportionally, then the average of each quantile is always at least its quantile value. This is the D-K effect, which is weaker as the correlation grows.

- The opposite is true for a disproportional relation.

So, the D-K plot is extremely sensitive to correlations and can easily exaggerate the weakest of correlations.
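
The first two cases above can be checked with a rough simulation. This is a sketch under stated assumptions, not the article's or the original paper's method: actual scores are uniform percentiles, and the `weight` blend between the actual score and a random guess is a hypothetical way to dial the correlation up or down.

```python
import random
from statistics import mean

def quartile_means(weight=0.0, n=20000, seed=0):
    """Return (mean actual, mean self-assessed) percentile per actual-score quartile."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        actual = rng.uniform(0, 100)
        guess = rng.uniform(0, 100)
        # weight=0 is a pure random guess; a larger weight ties the
        # self-assessment more closely to the actual score (assumed model).
        people.append((actual, weight * actual + (1 - weight) * guess))
    people.sort(key=lambda p: p[0])  # rank everyone by actual score
    size = n // 4
    groups = [people[q * size:(q + 1) * size] for q in range(4)]
    return [(mean(a for a, _ in g), mean(s for _, s in g)) for g in groups]

for weight in (0.0, 0.3):
    print(weight, [(round(a), round(s)) for a, s in quartile_means(weight)])
```

With `weight=0.0` every quartile's self-assessment averages close to 50, so the bottom quartile "overestimates" and the top quartile "underestimates" even though the guesses carry no information. With a weak blend such as `weight=0.3` the same pattern remains, only less steep.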

ketozhang commented on The Dunning-Kruger effect is autocorrelation   economicsfromthetopdown.c... · Posted by u/ljosifov
snarkconjecture · 2 years ago
Nonstandard terminology warning: the author is using "autocorrelation" in a way I've never seen before. There is a much more common usage of "autocorrelation" to refer to the correlation of a timeseries with itself (shifted by some amount).

If you use autocorrelation to refer to the thing in OP, you'll probably confuse people who know statistics, and vice versa.

ketozhang · 2 years ago
The more common experience with autocorrelation is with time series, but what the author said is correct even in that context. A time series autocorrelation relates the same time series function at different times. At the simplest, you plot the arrays X vs X where X[i] = f(t[i]). You may then complicate it further with some transformation g(X) vs X (e.g., a moving average).
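
As a small illustration of the time-series case, here is a sketch that correlates a series with a shifted copy of itself. The `autocorrelation` helper and the sine-wave choice of f are assumptions made for the example:

```python
import math

def autocorrelation(x, lag):
    """Pearson correlation between x[i] and x[i + lag]."""
    a, b = x[:len(x) - lag], x[lag:]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

# X[i] = f(t[i]), with f a sine wave of period 20 samples (arbitrary choice).
x = [math.sin(2 * math.pi * i / 20) for i in range(200)]

for lag in (0, 5, 10):
    print(lag, round(autocorrelation(x, lag), 3))
```

At lag 0 this is literally X vs X, so the correlation is 1 regardless of what f is.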
ketozhang commented on Fast self-hostable open-source workflow engine   windmill.dev/blog/launch-... · Posted by u/rubenfiszel
ketozhang · 2 years ago
Did you guys consider existing standards for representing workflow definitions before choosing OpenFlow? For example, Common Workflow Language.
ketozhang commented on Use Timestamps   jankremer.eu/micro/timest... · Posted by u/jankremer
lijok · 2 years ago
Few things irk me as much as systems that show you "N hours/minutes/seconds ago" instead of the timestamp. GitHub for example, of all systems, should know better. Trying to write up a report of any sort and not having access to accurate timestamps is very annoying.
ketozhang · 2 years ago
Hover your mouse over those and you should get the absolute date. Some, if not many, are using time tags.
ketozhang commented on SciPy builds for Python 3.12 on Windows are a minor miracle   labs.quansight.org/blog/b... · Posted by u/todsacerdoti
aj7 · 2 years ago
Does anyone else find the handling of arrays by Python so horrific that they can’t bring themselves to use it?
ketozhang · 2 years ago
It's not popular because you're mostly hearing from the science community, who want more features in their arrays (vectors/matrices/tensors).

Why would you want to use C-like arrays in Python anyways?

u/ketozhang

Karma: 29 · Cake day: April 24, 2018