This part is what I want to understand. How does the llm “frame” an answer?
1. How do LLM/RAG systems generate an answer given a list of documents and a question? I can do BM25 to get a list of documents, but after that, what is the logic/algorithm that generates an answer from that list?
2. For small models like this, how much data do you need to fine-tune for a specific use case? For example, if I need this model to be knowledgeable about HTML/CSS, there is a lot of documentation online that I can feed it. But if it is a very specific topic, like types of banana, there may be only a couple of Wikipedia pages. So is fine-tuning directly dependent on the quantity of data alone?
then your query is converted into an embedding and the top N chunks are returned via similarity search (cosine, dot product, or some other method) - this has advantages over BM25, which is purely lexical
then you can do some processing, or just hand over all the chunks as context, saying "here are some documents, use them to answer this question" + your query, to the LLM
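a minimal sketch of that retrieve-then-prompt step, assuming sentence-transformers for the embeddings (the model name, sample chunks, and prompt wording are just placeholders, and the final LLM call is left abstract):

```python
# Minimal RAG sketch: embed chunks, retrieve by cosine similarity, stuff into a prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works here

chunks = [
    "HTML describes the structure of a web page.",
    "CSS controls the presentation and layout of HTML elements.",
    "The Cavendish is the most widely exported banana cultivar.",
]
# normalize so a dot product equals cosine similarity
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query, top_n=2):
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q                      # cosine similarity per chunk
    best = np.argsort(scores)[::-1][:top_n]      # indices of the top-N chunks
    return [chunks[i] for i in best]

query = "What does CSS do?"
context = "\n\n".join(retrieve(query))
prompt = (
    "Here are some documents. Use them to answer the question.\n\n"
    f"{context}\n\nQuestion: {query}\nAnswer:"
)
# the prompt then goes to whatever LLM you're running (llama.cpp, an API, etc.)
# answer = llm(prompt)
print(prompt)
```

the "generation" part is nothing more than that: the model just continues the prompt, grounded by whatever chunks you pasted in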
feels like a cool project/toy
Rather not write it myself
The very few people I know who have had this happen were all computer users, and virtually all were victims of social engineering such as "hey, I'm from the IT department, I'm sending you an email, could you please...". A friend of mine exposed sensitive data of thousands of her bank's customers this way.