danielcampos93 (u/danielcampos93)

danielcampos93 commented on Generative AI hype peaking? bjornwestergard.com/gener... · Posted by u/bwestergard

Nvidia is up 24% in the last year compared to <10% for the Nasdaq or S&P. Cherry-picking the point to compare to is bad.

They also 10xed their revenue. 24% seems low for a company that pulled that off.

danielcampos93 commented on Show HN: Benchmarking VLMs vs. Traditional OCR getomni.ai/ocr-benchmark... · Posted by u/themanmaran

danielcampos93 · 6 months ago

Using GPT-4o as a judge to evaluate the quality of something that GPT-4o is not inherently that good at. Red flag.

danielcampos93 commented on Perplexity Deep Research perplexity.ai/hub/blog/in... · Posted by u/vinni2

CSMastermind · 7 months ago

I'm super happy that these types of deep research applications are being released because it seems like such an obvious use case for LLMs.

I ran Perplexity through some of my test queries for these.

One query that it choked hard on was, "List the college majors of all of the Fortune 100 CEOs"

OpenAI and Gemini both handle this somewhat gracefully producing a table of results (though it takes a few follow ups to get a correct list). Perplexity just kind of rambles generally about the topic.

There are other examples I can give of similar failures.

Seems like generally it's good at summarizing a single question (Who are the current Fortune 100 CEOs) but as soon as you need to then look up a second list of data and marry the results it kind of falls apart.

danielcampos93 · 7 months ago

Does it do the full 100? In my experience, any task involving a long list that needs to be exhaustive (all states, all Fortune 100) tends to miss a few items.
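A quick way to check for this kind of gap is to diff the model's answer against a reference list. The sketch below is only an illustration: the `missing_items` helper, the four-state reference list, and the sample answer are all made up for the example, and real answers would need more careful name normalization.

```python
def missing_items(expected, returned):
    """Return expected items that never appear in the model's answer, sorted."""
    # Normalize case and surrounding whitespace before comparing.
    seen = {r.strip().lower() for r in returned}
    return sorted(e for e in expected if e.strip().lower() not in seen)


# Hypothetical reference list and model output, shortened for illustration.
expected_states = ["Alabama", "Alaska", "Arizona", "Arkansas"]
model_answer = ["alabama", "Arizona ", "Arkansas"]

gaps = missing_items(expected_states, model_answer)
print(f"{len(model_answer)}/{len(expected_states)} returned, missing: {gaps}")
```

Counting the rows alone is not enough, since a duplicated entry can hide a missing one; comparing against the reference set catches both.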

danielcampos93 commented on Perplexity Deep Research perplexity.ai/hub/blog/in... · Posted by u/vinni2

exclipy · 7 months ago

Not true at all. The original ChatGPT was useless other than as a curious entertainment app.

Perplexity, OTOH, has almost completely replaced Google for me now. I'm asking it dozens of questions per day, all for free because that's how cheap it is for them to run.

The emergence of reliable tool use last year is what has sky-rocketed the utility of LLMs. That has made search and multi-step agents feasible, and by extension applications like Deep Research.

danielcampos93 · 7 months ago

It's not free because it's cheap for them to run. It's free because they are burning late-stage VC dollars. Despite what you might believe if you only follow them on Twitter, the biggest input to their product, aka a search index, is mostly based on Brave/Bing/SerpAPI, and those numbers are pretty tight. Big expectations for ads will determine what the company does.

danielcampos93 commented on WhisperNER: Unified Open Named Entity and Speech Recognition arxiv.org/abs/2409.08107... · Posted by u/timbilt

will-burner · 9 months ago

Is there any reason why this would work better or is needed compared to taking audio and 1. doing ASR with whisper for instance 2. applying an NER model to the transcribed text?

There are open source NER models that can identify any specified entity type (https://universal-ner.github.io/, https://github.com/urchade/GLiNER). I don't see why this WhisperNER approach would be any better than doing ASR with whisper and then applying one of these NER models.

danielcampos93 · 9 months ago

This works better because it gives the decoder (the part generating text) a second set of conditions on its generation. Assume that instead of their demo you are doing Speech2Text for oncologists. Out of the box, Whisper is terrible because the words are new and rare, especially in YouTube videos. If you just run ASR and then run NER on the transcript, it will generate regular words in place of cancer names, and the NER model has nothing to find. If you instead condition generation on topical entities, the generation space is constrained and performance will improve, especially when you can tell the model what all the drug names are because you have a list (https://www.cancerresearchuk.org/about-cancer/treatment/drug...)
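To make the intuition concrete, here is a toy sketch of why a known entity list helps. It is not how WhisperNER works (that conditions the decoder itself); this is a simpler post-hoc rescoring stand-in, and the `Hypothesis` class, the `rescore` function, the bonus value, and the candidate scores are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    score: float  # made-up log-probability; higher is better


def rescore(hypotheses, entities, bonus=2.0):
    """Pick the best hypothesis after adding a bonus per known entity it contains."""
    entity_set = {e.lower() for e in entities}
    rescored = []
    for h in hypotheses:
        # Crude tokenization: split on whitespace and drop trailing punctuation.
        words = {w.strip(".,").lower() for w in h.text.split()}
        hits = len(words & entity_set)
        rescored.append(Hypothesis(h.text, h.score + bonus * hits))
    return max(rescored, key=lambda h: h.score)


# Hypothetical drug list and recognizer outputs.
drug_names = ["pembrolizumab", "cisplatin"]
candidates = [
    Hypothesis("the patient started pembrolizumab today", -4.1),
    Hypothesis("the patient started pem bro lizard mab today", -3.2),
]

best = rescore(candidates, drug_names)
print(best.text)
```

Without the entity list, the second candidate wins on raw score because it is built from common words. With the list, the rare drug name gets enough of a boost to win, which is the same pressure that conditioning the decoder applies earlier in the pipeline.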

danielcampos93 commented on The withering dream of a cheap American electric car wsj.com/business/autos/th... · Posted by u/voisin

wannacboatmovie · 10 months ago

> coworker of mine just spent $100k on a regular old pickup truck

> It gets like 11 mpg and uses the 92 octane fuel.

I understand hating on pickup trucks is an easy way to farm upvotes on HN, but there is no 'regular pickup truck' in existence that gets 11 mpg. The closest that comes to that is the F-150 Raptor with turbocharged V8 which is a preposterous performance vehicle with a racing engine. It is a luxury item. Yet for some reason we don't criticize people with the same disdain who buy and drive sports cars which get as bad or even worse mpg. I guess the Lambo drivers never need to haul lumber.

The F-150 is also offered in hybrid (which gets > double that mpg) and all electric drivetrains.

I will make the equally presumptuous assumption that since you've narrowed your choices to "Prius or Prius" you harbor some grudges against pickup owners.

danielcampos93 · 10 months ago

https://www.youtube.com/watch?v=ecnS1Ygf0o0 I've been waiting for the chance to use this

danielcampos93 commented on Molmo: a family of open multimodal AI models molmo.allenai.org/blog... · Posted by u/jasondavies

danielcampos93 · a year ago

Not mentioned in their blog posts, but on the model cards on Hugging Face: "Molmo 72B is based on Qwen2-72B and uses OpenAI CLIP as vision backbone. Molmo-72B achieves the highest academic benchmark score and ranks second on human evaluation, just slightly behind GPT-4o." Others are based on Qwen 7B. What happened to the OLMo chain?