anktor (u/anktor) - Readit News

anktor commented on I took all my projects off the cloud, saving thousands of dollars rameerez.com/send-this-ar... · Posted by u/sebnun

pnutjam · 2 months ago

The most consistent misunderstanding I see about the cloud is disk I/O. Nobody understands how slow your standard cloud disk is under load. They see good performance and assume that will always be the case. They don't realize that most cloud disks use a form of token tracking where they build up I/O credits over time, and if you have bursts or sustained high I/O load you will very quickly notice that your disk speeds are garbage.

For some reason people more easily understand the limits of CPU and memory, but overlook disk constantly.
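To make the "token tracking" idea concrete, here is a toy sketch of a burst-credit bucket. It is not any provider's actual algorithm: the class name `BurstBucket`, the one-second time step, and every number used are assumptions picked only to show the shape of the behavior (fast while credits last, then pinned to the baseline).

```python
from dataclasses import dataclass


@dataclass
class BurstBucket:
    """Toy burst-credit model of a throttled disk (hypothetical, illustrative numbers)."""
    baseline_iops: float  # credits refilled per second; also the sustained rate
    burst_iops: float     # maximum rate while credits remain
    capacity: float       # maximum credits the bucket can hold
    credits: float        # credits currently available

    def step(self, demand_iops: float) -> float:
        """Advance one second and return the IOPS actually served."""
        # Refill first, capped at the bucket capacity.
        self.credits = min(self.capacity, self.credits + self.baseline_iops)
        # Served I/O is limited by demand, the burst ceiling, and remaining credits.
        served = min(demand_iops, self.burst_iops, self.credits)
        self.credits -= served
        return served


def simulate(bucket: BurstBucket, demand_iops: float, seconds: int) -> list:
    """Apply a constant demand for a number of seconds and record what was served."""
    return [bucket.step(demand_iops) for _ in range(seconds)]


if __name__ == "__main__":
    disk = BurstBucket(baseline_iops=100, burst_iops=3000, capacity=10000, credits=10000)
    # Sustained demand at the burst ceiling: fast at first, then throttled.
    print(simulate(disk, demand_iops=3000, seconds=6))
```

With these made-up numbers the disk serves the full burst rate for a few seconds, then drops to the baseline once the credits are gone, which is the "good performance that does not last" effect described above.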

anktor · 2 months ago

What could I read to inform myself better on this topic? It is true I had not seen this angle before.

anktor commented on Pg_lake: Postgres with Iceberg and data lake access github.com/Snowflake-Labs... · Posted by u/plaur782

mslot · 2 months ago

DuckLake is pretty cool, and we obviously love everything DuckDB is doing. It's what made pg_lake possible, and what motivated part of our team to step away from Microsoft/Citus.

DuckLake can do things that pg_lake cannot do with Iceberg, and DuckDB can do things Postgres absolutely can't (e.g. query data frames). On the other hand, Postgres can do a lot of things that DuckDB cannot do. For instance, it can handle >100k single row inserts/sec.

Transactions don't come for free. Embedding the engine in the catalog rather than the catalog in the engine enables transactions across analytical and operational tables. That way you can do a very high rate of writes in a heap table, and transactionally move data into an Iceberg table.

Postgres also has a more natural persistence & continuous processing story, so you can set up pg_cron jobs and use PL/pgSQL (with heap tables for bookkeeping) to do orchestration.

There's also the interoperability aspect of Iceberg being supported by other query engines.

anktor · 2 months ago

What does "data frames" mean in this context? I'm used to them in Spark or pandas, but does this relate to something in how DuckDB operates, or is it something else?

anktor commented on Bear is now source-available herman.bearblog.dev/licen... · Posted by u/neoromantique

sgc · 4 months ago

I often think the solution is to move away from crafting a perfect ideology to encapsulate in your license, and just throw out some numbers. If you make more than N* the median income of this or that place, you can't use this software for free (whether that means licensing fees, code contribution, etc can vary). Let the smaller fish grow. If they get big enough, they can give back.

anktor · 4 months ago

How would that ever be enforced? Do you not run into the WinRAR/Sublime problem of "hey, you've been using this, pay us, please"?

anktor commented on Flunking my Anthropic interview again taylor.town/flunking-anth... · Posted by u/surprisetalk

user_7832 · 4 months ago

> If you're not getting offers, I strongly recommend that you find somebody you trust to do a mock interview. Let them critique your resume, cover letter, posture, awkwardness, lame handshake, etc.

Slightly odd question but: what if it's the opposite of this?

Interviews are almost never an issue.

I would like to think (and have been told so too) that I'm both technically sharp and knowledgeable enough, and can communicate well enough. I have a firm handshake, and thanks to the ability to happily dive into topics I read up on, I can speak confidently - both on hard facts, as well as my understanding or opinion of any technical matter in my field - for hours maybe, if not longer.

But getting the interview... is.. legitimately hard. Multiple people have said my resume is quite solid, but I rarely get through beyond the base round.

Would you have any tips for just the act of getting a foot in the door, so to say? I'm reasonably optimistic I can take it from there.

(Two things I can probably change - using customized CVs (and a cover letter, where applicable), and reaching out to employees/HR at the places I'm applying at. Though that honestly seems exhausting with so many applications...)

anktor · 4 months ago

Without any context of culture or country, just trying to be helpful: in my limited (<20 total interviews) experience, I would think about budget issues.

Meaning, what you ask for (or how expensive you are perceived to be, if you have that strong a resume) may be too far from what is typical in the industry you are applying to, and that could be limiting your access.

Sometimes I feel junior people have it easier (I felt like I did, personally), since the salary expense is pretty limited compared to either other roles or more senior people.

anktor commented on Databricks is raising a Series K Investment at >$100B valuation databricks.com/company/ne... · Posted by u/djhu9

uxcolumbo · 4 months ago

Are there any cheaper alternatives to Databricks, EC2, DynamoDB, S3 solution? Where cost is more predictable and controlled?

What's a good roll your own solution? DB storage doesn't need to be dynamic like with DynamoDB. At max 1TB - maybe double in the future.

Could this be done on a mid size VPS (32GB RAM) hosting Apache Spark etc - or better to have a couple?

P.S. total beginner in this space, hence the (naive) question.

anktor · 4 months ago

It's been mentioned, but I want to add that the original idea of the post (a mid-size VPS hosting Apache Spark) might be missing that Spark is ideal for distributed and resilient work (if a node fails, the framework is able to avoid losing that work).

If you don't need these features, especially the distributed one, going tall (a single instance with high capacity, replicated when necessary) or going simpler (multiple servers, but without Spark coordinating the work) could be good options, depending on your/the team's knowledge.
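As a rough sketch of the "going simpler" option: each server can decide which inputs are its own by hashing a stable key, so no coordinator is needed. The names `shard_for` and `plan`, and the file names in the usage example, are hypothetical. The trade-off is exactly the one mentioned above: if a server dies, nothing reassigns its share of the work the way Spark would.

```python
import hashlib


def shard_for(key: str, n_workers: int) -> int:
    """Map a key to a worker index, stable across machines and runs."""
    # hashlib is used instead of the built-in hash(), which is salted per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


def plan(keys, n_workers: int) -> dict:
    """Group keys by the worker that should process them."""
    assignments = {worker: [] for worker in range(n_workers)}
    for key in keys:
        assignments[shard_for(key, n_workers)].append(key)
    return assignments


if __name__ == "__main__":
    # Hypothetical input files; every server computes the same plan and
    # processes only the list under its own index.
    files = [f"events-{day:02d}.parquet" for day in range(1, 11)]
    for worker, assigned in plan(files, n_workers=3).items():
        print(worker, assigned)
```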

anktor commented on Cursor 1.0 cursor.com/en/changelog/1... · Posted by u/ecz

sandos · 7 months ago

No, he basically means that companies will not allow LLMs on their own code, I think.

I work in a multinational conglomerate, and we got AI allowed ... 2-3 weeks ago. Before that it was basically banned unless you had gotten permission. We did have another gpt4 based AI in the browser available for a few months before that as well.

anktor · 7 months ago

Correct. I don't want to circumvent rules but sometimes it feels like falling behind, like for reviewing MRs.

anktor commented on Cursor 1.0 cursor.com/en/changelog/1... · Posted by u/ecz

anktor · 7 months ago

Does anyone have experience with using this or another agent on local files? No company I know of will approve this for their owned repositories.

What about GitLab instead of GitHub, is there a product equivalent to Cursor 1.0?

anktor commented on Sleep is essential – researchers are trying to work out why nature.com/articles/d4158... · Posted by u/sohkamyung

schwartzworld · 9 months ago

Easy to get, hard to get used to for many. It’s common for doctors to offer little to no support after prescribing the machine.

anktor · 9 months ago

What would support look like in this situation? Honest question, I simply don't know.

anktor commented on Johnny.Decimal – A system to organise your life johnnydecimal.com... · Posted by u/debone

hn_throwaway_99 · 10 months ago

Just a general observation as someone nearing 50. I'm honestly very curious to see if someone has had a different experience than me. I am, to put it mildly, not an "organized person". I have tried a million different systems throughout my life - GTD, Inbox Zero, spreadsheets, etc. etc.

To be honest, I don't believe that any of these "organization systems" really help people that have problems being organized in the first place. I think it's just a fundamentally different way of how I'm wired. My general conclusion is that trying to "fight" my natural way of doing things is always going to be a losing battle, and that instead I just need to figure out ways to handle my general messiness and get it to work for me. I mean, I can certainly be organized for sizable stretches of time, but whenever I start getting pressed for time, or stressed, or lose my motivation for some other reason, it always reverts to the mean.

I'd honestly be really interested to hear if anyone has ever changed from being an "unorganized person" to an "organized person", because in my few decades of life I've never seen it be successfully accomplished.

anktor · 10 months ago

Recently I have been thinking about this, because I feel I have managed to become way more organized than I ever thought possible.

What is working for me right now is noting everything either in a calendar, so I cannot forget it, or as a TODO in a somewhat heavily personalized Obsidian configuration.

A few years ago (5-6, approx.) I started copying my older co-workers' habits to see myself improve. Physical notebooks were soon discarded because I could never remember where I had written things down.

I used a TODO plugin in Sublime, which worked for several months, until I felt I needed screenshots, so I moved to OneNote. After a while I became frustrated with not being able to customize it enough, so I started trying out different things. I saw a coworker using Obsidian, watched a couple of long YouTube videos to learn how to customize it, and I'm never going back.

My team this week told me they are impressed with how much info I write down and it was a very proud moment for me!

anktor commented on Apache Iceberg iceberg.apache.org/... · Posted by u/jacobmarble

mritchie712 · a year ago

Why don't you want a catalog? The SQL or REST catalogs are pretty light to set up. I have my eye on lakekeeper[0], but Polaris (from Snowflake) is a good option too.

PyIceberg is likely the easiest way to write without Spark.

0 - https://github.com/lakekeeper/lakekeeper

anktor · a year ago

PyIceberg is nice, but we had to drop it because it lags behind the Java API and it's unclear when it will catch up, so depending on which features you need, I'd check what it supports first.