vivahir215 (u/vivahir215)

vivahir215 commented on Hyperparameter Tuning Is a Resource Scheduling Problem jchandra.com/posts/hyperp... · Posted by u/jchandra

jchandra · 4 months ago

Totally fair point — at the end of the day, it's all about getting the best model performance. I was mostly trying to highlight how, under the hood, a lot of modern HPO algos really boil down to smart scheduling decisions.

vivahir215 · 4 months ago

The total computational budget in Hyperband needs to be elaborated. There is more to it.
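For readers following along, here is a rough sketch of how the budget is usually split across brackets. It assumes the bracket formulas from the original Hyperband paper (a maximum per-configuration resource `R` and a reduction factor `eta`); the function name `hyperband_brackets` and the dictionary layout are made up for illustration, and the article under discussion may define things differently.

```python
def hyperband_brackets(max_resource: int, eta: int = 3):
    """Sketch of Hyperband's bracket schedule (hypothetical helper)."""
    # s_max = floor(log_eta(R)), computed with integers to avoid float rounding.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta^s / (s + 1)).
        n = ((s_max + 1) * eta ** s + s) // (s + 1)
        # Initial resource given to each configuration in this bracket.
        r = max_resource / eta ** s
        # Successive halving: round i keeps floor(n / eta^i) configurations
        # and trains each of them with r * eta^i units of resource.
        rounds = [(n // eta ** i, r * eta ** i) for i in range(s + 1)]
        cost = sum(n_i * r_i for n_i, r_i in rounds)
        brackets.append({"s": s, "rounds": rounds, "cost": cost})
    return brackets


if __name__ == "__main__":
    for bracket in hyperband_brackets(max_resource=81, eta=3):
        print(bracket["s"], bracket["rounds"], bracket["cost"])
```

Each bracket trades off "many configurations, little resource each" against "few configurations, full resource each", while spending a roughly similar amount per bracket. The total budget is the sum of the per-bracket costs.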

vivahir215 commented on Hyperparameter Tuning Is a Resource Scheduling Problem jchandra.com/posts/hyperp... · Posted by u/jchandra

vivahir215 · 4 months ago

I don't think it's just a resource scheduling problem; there is more to it than that. The goal is model performance, not just efficient resource use.

Nice article, though, and I can see your point.

vivahir215 commented on AI Supply Chain Attack: How Malicious Pickle Files Backdoor Models jchandra.com/posts/python... · Posted by u/jchandra

vivahir215 · 5 months ago

You could use https://github.com/trailofbits/fickling for analysis.

vivahir215 commented on How Pickle Files Backdoor AI Models jchandra.com/posts/python... · Posted by u/jchandra

jchandra · 6 months ago

PyTorch save/load is still pickle-based. It's fine for trusted sources, but once you start loading models from untrusted sources there is always a risk of ACE (arbitrary code execution). If you want to execute such a model, I would suggest trying it in a sandboxed environment such as Docker, a VM, or an online notebook environment. The other option is to inspect the model file.

As open-source AI booms, the risk of supply chain attacks also increases.
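As a rough illustration of the "inspect the model file" option, the sketch below walks a pickle stream with the standard-library `pickletools` module without ever unpickling it, and lists the opcodes that can import or call objects. The function name `suspicious_opcodes` and the opcode set are choices made for this example, not an established tool or a complete audit.

```python
import pickle
import pickletools

# Opcodes that import a global or call/instantiate an object during unpickling.
# This set is an illustrative assumption, not an exhaustive safety list.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX", "REDUCE"}


def suspicious_opcodes(data: bytes) -> list[str]:
    """Return the names of potentially dangerous opcodes, in stream order."""
    # genops only parses the byte stream; it does not execute anything.
    return [op.name for op, _arg, _pos in pickletools.genops(data) if op.name in SUSPICIOUS]


if __name__ == "__main__":
    # A plain list needs no imports or calls, so nothing is flagged.
    print(suspicious_opcodes(pickle.dumps([1, 2, 3])))
```

Legitimate model files also use these opcodes to rebuild tensors and classes, so a hit is a prompt to look at which globals are referenced, not proof of a backdoor. Recent PyTorch checkpoints are zip archives, so the pickle stream would have to be extracted from the archive first.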

vivahir215 · 6 months ago

Cool.

vivahir215 commented on How Pickle Files Backdoor AI Models jchandra.com/posts/python... · Posted by u/jchandra

jchandra · 6 months ago

joblib is not fully secure because it still relies on pickle internally. The reason it is slightly better than pickle is that a pickle file gets executed immediately when it is imported, whereas joblib doesn't execute code just by being imported.

vivahir215 · 6 months ago

Ah, okay. Didn't know this. I generally use PyTorch's save for models in my workflow.

vivahir215 commented on How Pickle Files Backdoor AI Models jchandra.com/posts/python... · Posted by u/jchandra

vivahir215 · 6 months ago

Nice read!

You could also use the joblib format.

vivahir215 commented on We built a modern data stack from scratch and reduced our bill by 70% jchandra.com/posts/data-i... · Posted by u/jchandra

jchandra · 6 months ago

As for BigQuery, while it's a great tool, we faced challenges with high-volume, small queries, where costs became unpredictable because it is priced by the volume of data scanned. Clustered tables and materialised views helped to some extent, but they didn’t fully mitigate the overhead for our specific workloads. There are certainly ways to work around and optimize this, so I wouldn't exactly blame GBQ or any of its limitations.

It’s always a trade-off, and we made the call that best fit our scale, workloads, and long-term plans.

vivahir215 · 6 months ago

Hmm, Okay.

I am not sure whether managing a Kafka Connect cluster is too expensive in the long term. This solution might work for you based on your needs, but I would suggest looking for alternatives.

vivahir215 commented on We built a modern data stack from scratch and reduced our bill by 70% jchandra.com/posts/data-i... · Posted by u/jchandra

vivahir215 · 6 months ago

Good read.

I do have a question about BigQuery. If you were experiencing unpredictable query costs or customization issues, that sounds like user error. There are ways to optimize queries or commit slots to reduce the cost. Did you try that?

u/vivahir215

Karma: 3 · Cake day: March 9, 2025