faizshah commented on Agentic Development Environment by JetBrains   air.dev... · Posted by u/NumerousProcess
rfw300 · 11 days ago
I'd like others' input on this: increasingly, I see Cursor, Jetbrains, etc. moving towards a model of having you manage many agents working on different tasks simultaneously. But in real, production codebases, I've found that even a single agent is faster at generating code than I am at evaluating its fitness and providing design guidance. Adding more agents working on different things would not speed anything up. But perhaps I am just much slower or a poorer multi-tasker than most. Do others find these features more useful?
faizshah · 11 days ago
The parallel agent model is better for when you know the high level task you want to accomplish but the coding might take a long time. You can split it up in your head “we need to add this api to the api spec” “we need to add this thing to the controller layer” etc. and then you use parallel agents to edit just the specific files you’re working on.

So instead of interactively making one agent do a large task you make small agents do the coding while you focus on the design.

faizshah commented on Agentic Development Environment by JetBrains   air.dev... · Posted by u/NumerousProcess
faizshah · 11 days ago
Not to be overly negative, but I'm kinda disappointed with this, despite having been a JetBrains shill for many years.

I already use this workflow myself, just multiple terminals with Claude running in different directories. There are something like 100 of these "Claude with worktrees in parallel" UIs now, so I would have expected some of the usual JetBrains value adds: deep debugger integration, a fancy test-runner view, etc. The only one I see called out is Local History. I don't see any deep diff or find-in-files integration for diffing or searching between the agent worktrees, and I don't see the JetBrains commit, shelf, etc. git integration that we like.

I do like the cursor-like highlight and add to context thing and the kanban board sort of view of the agent statuses, but this is nothing new. I would have expected at the least that jetbrains would provide some fancier UI that lets you select which directories or scopes should be auto approved for edit or other fancy fine grained auto-approve permissions for the agent.

In summary, it looks like just another parallel Claude UI rather than a JetBrains take on it. It also seems to be a separate IDE rather than built on the IntelliJ Platform, so they probably won't turn it into a plugin in the future either.
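For reference, the "multiple terminals with Claude in different directories" workflow boils down to one git worktree per task. A minimal sketch of the plumbing (the agent command is a placeholder, not anything JetBrains or Anthropic ships):

```python
import subprocess

def worktree_cmds(repo, task_branch):
    """Commands that give one agent an isolated checkout of the repo."""
    wt_path = f"../{task_branch}"
    return [
        # one worktree + branch per task, so agents never touch each other's files
        ["git", "-C", repo, "worktree", "add", "-b", task_branch, wt_path],
        # placeholder agent invocation; substitute your actual CLI agent here
        ["your-agent", "--cwd", wt_path],
    ]

for task in ["api-spec-update", "controller-layer"]:
    for cmd in worktree_cmds(".", task):
        print(" ".join(cmd))
```

Each command list can then be handed to `subprocess.run` in its own terminal or tmux pane; `git worktree remove` cleans up afterwards.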

faizshah commented on Trifold is a tool to quickly and cheaply host static websites using a CDN   jpt.sh/projects/trifold/... · Posted by u/birdculture
0x3f · 14 days ago
I feel like I've tried many similar combos and there ends up being some tiny, silly, trivial thing that bothers me in the end. For example, I remember fighting with one of them that forced trailing slashes, and another that didn't allow apex domains (i.e. non-www address) for static sites.

I absolutely refuse to actually ship valuable things though so thanks for the suggestion and I'll probably spend some time trying it out.

faizshah · 13 days ago
I agree. My current weekend project is figuring out a dirt-cheap, high-performance self-hosted cloud for hosting stuff.

So I'm still sticking with Route53 since it's the least annoying registrar and DNS API; for CDN I'm going with Bunny, and for dirt-cheap object storage I'm going with B2.

Then the fun part is the actual self-hosting: I'm going with Garage for my normal self-hosted S3 API (B2 is for backups etc.), Scylla as the DynamoDB stand-in, Spin for super fast Wasm FaaS…

Then this weekend I got deep into trying to build my CloudWatch alternative. I think I'm going with dumping logs into B2 with Vector and then using Quickwit to search them.
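For what it's worth, the Vector-to-B2 leg of that pipeline is a small config. A sketch (bucket name, paths, and endpoint are placeholders; the keys follow Vector's `file` source and `aws_s3` sink, so double-check them against your Vector version):

```toml
# vector.toml — ship local app logs to a B2 bucket via the S3-compatible API
[sources.app_logs]
type = "file"
include = ["/var/log/myapp/*.log"]   # placeholder path

[sinks.b2]
type = "aws_s3"
inputs = ["app_logs"]
bucket = "my-log-bucket"                              # placeholder bucket
endpoint = "https://s3.us-west-004.backblazeb2.com"   # your B2 region endpoint
region = "us-west-004"
key_prefix = "logs/%Y-%m-%d/"

[sinks.b2.encoding]
codec = "json"

[sinks.b2.framing]
method = "newline_delimited"
```

Quickwit can then index straight from the same bucket, since it also speaks the S3 API.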

Just a fun homelab challenge really.

faizshah commented on Trifold is a tool to quickly and cheaply host static websites using a CDN   jpt.sh/projects/trifold/... · Posted by u/birdculture
faizshah · 14 days ago
FYI, if you want an S3 + CloudFront analogue setup, B2 is integrated with Bunny and allows private buckets: https://www.backblaze.com/docs/cloud-storage-integrate-bunny...

I haven’t yet worked out the best cheap VPS/dedicated provider though, project for next weekend.

faizshah commented on Self-hosting a NAT Gateway   awsistoohard.com/blog/sel... · Posted by u/veryrealsid
notTooFarGone · 23 days ago
It's honestly ridiculous that people are only now seeing that self-hosting is stupidly cheaper and still 99.9% reliable.

No, your service does not need the extra 0.099% availability for 100x the price...

Make your own VPN while you're at it; WireGuard is basically the same amount of config.
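To illustrate how small that config really is, here is a sketch of a WireGuard gateway's `wg0.conf` (keys, addresses, and the egress interface are all placeholders):

```ini
; /etc/wireguard/wg0.conf on the gateway box
[Interface]
Address = 10.0.0.1/24
ListenPort = 51820
PrivateKey = <server-private-key>
; NAT traffic from the tunnel out the public interface (eth0 is a placeholder)
PostUp = iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

[Peer]
PublicKey = <client-public-key>
AllowedIPs = 10.0.0.2/32
```

The client side is the mirror image: its own `[Interface]` keypair plus a `[Peer]` block pointing at the gateway's endpoint, with `AllowedIPs = 0.0.0.0/0` if you want to route everything through it.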

faizshah · 23 days ago
I think AI coding is another part of why this is seeing a resurgence. It's a lot quicker to build quick-and-dirty scripts or debug the random issues that come up when self-hosting.

faizshah commented on 650GB of Data (Delta Lake on S3). Polars vs. DuckDB vs. Daft vs. Spark   dataengineeringcentral.su... · Posted by u/tanelpoder
throwaway-aws9 · a month ago
650GB? Your data is small, fits on my phone. Dump the hyped tooling and just use gnu tools.

Here's an oldie on the topic: https://adamdrake.com/command-line-tools-can-be-235x-faster-...

faizshah · a month ago
This isn't true anymore; we're way beyond 2014 Hadoop (which is what the blog post is about) at this point.

Go try doing an aggregation of 650GB of JSON data using normal CLI tools vs. DuckDB or ClickHouse. These tools are pipelining and parallelizing in a way that isn't easy to replicate with just GNU Parallel (trust me, I've tried).
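To make the shape of the problem concrete, here is roughly what those engines do internally — per-shard partial aggregates merged at the end — sketched in stdlib Python (threads stand in for their native parallelism, and the key/value schema is made up for illustration):

```python
import json
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partial_agg(path):
    """Aggregate one JSON-lines shard into a partial result."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            counts[rec["key"]] += rec["value"]
    return counts

def aggregate(paths, workers=8):
    """Fan out per-file partial aggregates, then merge the partials.
    The merge step is the part that's awkward to express with
    sort | uniq pipelines or GNU Parallel."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(partial_agg, paths):
            total.update(partial)
    return total
```

A `GROUP BY` in DuckDB or ClickHouse does this (plus columnar decoding, spilling, and real multi-core scheduling) without you writing any of it.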

faizshah commented on 650GB of Data (Delta Lake on S3). Polars vs. DuckDB vs. Daft vs. Spark   dataengineeringcentral.su... · Posted by u/tanelpoder
faizshah · a month ago
I had to do something like this for a few TB of JSON recently. The unique thing about this workload was that it was a ton of small 10-20MB files.

I found that ClickHouse was the fastest, but DuckDB was the simplest to work with; it usually just works. DuckDB was close enough to ClickHouse's peak performance.

I tried Flink and PySpark, but they were way slower (like 3-5x) than ClickHouse and the code was kind of annoying. Dask and Ray were also too slow; Dask's parallelism was easy to code, but it just wasn't fast enough. I also tried DataFusion and Polars, but ClickHouse ended up being faster.

These days I would recommend starting with DuckDB or ClickHouse for most workloads, just because they're the easiest to work with AND have good performance. Personally, I switched to DuckDB instead of Polars for most things where pandas is too slow.

faizshah commented on Migrating from AWS to Hetzner   digitalsociety.coop/posts... · Posted by u/pingoo101010
faizshah · 2 months ago
Can anyone recommend a good cloud for GPU instances?

I was trying to find a good one for 30B quants, but there are so many now and the pricing is all over the place.
