Readit News
bguberfain commented on TradeExpert, a trading framework that employs Mixture of Expert LLMs   arxiv.org/abs/2411.00782... · Posted by u/wertyk
bguberfain · 3 months ago
So they used an LLM with a knowledge cutoff in mid-2023 to evaluate 2023? Seems like a classic data-leakage problem.

From paper: "testing set: January 1, 2023, to December 31, 2023"

From the Llama 2 doc: "(...) some tuning data is more recent, up to July 2023."
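A quick sanity check using the two quoted dates makes the overlap concrete (a rough sketch; the exact cutoff day within July is my assumption):

```python
from datetime import date

# Does the model's tuning-data cutoff overlap the backtest window?
# Dates taken from the quotes above; day-of-month is assumed.
tuning_cutoff = date(2023, 7, 1)   # "up to July 2023" (Llama 2 doc)
test_start = date(2023, 1, 1)      # paper's testing set start
test_end = date(2023, 12, 31)      # paper's testing set end

leaked_days = (min(tuning_cutoff, test_end) - test_start).days
total_days = (test_end - test_start).days
print(f"{leaked_days / total_days:.0%} of the test window precedes the cutoff")
# → 50% of the test window precedes the cutoff
```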

bguberfain commented on LLM function calls don't scale; code orchestration is simpler, more effective   jngiam.bearblog.dev/mcp-l... · Posted by u/jngiam1
bguberfain · 3 months ago
I think there may be another solution for this: have the LLM write valid code that calls the MCPs as functions. Think of it as a Python script where each MCP is mapped to a function. A simple example:

  def process(param1, param2):
      my_data = mcp_get_data(param1)
      sorted_data = mcp_sort(my_data, by=param2)
      return sorted_data

bguberfain commented on NASA keeps ancient Voyager 1 spacecraft alive with Hail Mary thruster fix   theregister.com/2025/05/1... · Posted by u/nullhole
mek6800d2 · 4 months ago
Part of this excellent movie revolved around the months-long shutdown of the 70-meter antenna at the Deep Space Network station in Canberra, Australia. Coincidentally, the new JPL press release about Voyager 1's thrusters also details a new months-long shutdown (May 2025-Feb 2026) of that same antenna for more upgrades. It's the only antenna that can transmit to Voyager 2, which flew south of the ecliptic after its Neptune flyby. The DSN stations in Spain and California can still transmit to Voyager 1, which flew north of the ecliptic after its Saturn flyby. (Todd Barber, quoted in The Register's article and in JPL's press release, appears in the movie.)
bguberfain · 4 months ago
Not available in my country :(
bguberfain commented on Transformer Lab   transformerlab.ai/... · Posted by u/jonbaer
woadwarrior01 · 5 months ago
Glad to see that the code[1] is AGPL-3.0 licensed.

[1]: https://github.com/transformerlab/transformerlab-app

bguberfain · 5 months ago
Unfortunately, it uses Miniconda, whose license does not allow free use in companies with more than 200 employees. I think that conflicts with the AGPL license. I created a PR to fix that.
bguberfain commented on Gemma 3 Technical Report [pdf]   storage.googleapis.com/de... · Posted by u/meetpateltech
alekandreev · 6 months ago
Picking model sizes is not an exact science. We look for sizes that will fit quantized on different categories of devices (e.g., low-end and high-end smartphones, laptops, 16GB GPUs, and bigger GPUs/TPUs). We also want the ratio of model width to depth (number of layers) to be consistently around 90, which we found works best.

The models are trained with distillation from a bigger teacher. We train them independently, but for v3 we have unified the recipes for 4B-27B, to give you more predictability when scaling up and down to different model sizes.
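To illustrate what that width-to-depth heuristic means in practice (with made-up dimensions, not actual Gemma configs):

```python
# Illustrative only: hypothetical (hidden_width, num_layers) pair,
# chosen so the ratio lands on the ~90 target mentioned above.
def width_depth_ratio(hidden_width, num_layers):
    return hidden_width / num_layers

print(width_depth_ratio(2880, 32))  # → 90.0
```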

bguberfain · 6 months ago
Can you provide more information about this “bigger teacher” model?
bguberfain commented on GPT-4.5   openai.com/index/introduc... · Posted by u/meetpateltech
ekojs · 6 months ago
> Because of this, we’re evaluating whether to continue serving it in the API long-term as we balance supporting current capabilities with building future models.

Seems like it's not going to be deployed for long.

$75.00 / 1M tokens for input

$150.00 / 1M tokens for output

Those are crazy prices.
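A back-of-the-envelope calculation at the quoted rates shows how quickly this adds up:

```python
# Cost per request at the quoted GPT-4.5 API rates
INPUT_PER_M = 75.00    # USD per 1M input tokens
OUTPUT_PER_M = 150.00  # USD per 1M output tokens

def request_cost(input_tokens, output_tokens):
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 10k-token prompt with a 1k-token reply:
print(f"${request_cost(10_000, 1_000):.2f}")  # → $0.90
```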

bguberfain · 6 months ago
Until GPT-4.5, GPT-4 32K was certainly the heaviest model available at OpenAI. I can imagine the dilemma between keeping it running and stopping it to free GPUs for training new models. This time, OpenAI has been upfront about whether it will continue serving it in the API long-term.
bguberfain commented on LLMs can teach themselves to better predict the future   arxiv.org/abs/2502.05253... · Posted by u/bturtel
dantheman252 · 7 months ago
Danny here, one of the authors of this paper. If anyone has any questions or anything feel free to AMA!
bguberfain · 7 months ago
Any chance you could release the dataset to the public? I imagine NewsCatcher and Polymarket might not agree...

u/bguberfain

Karma: 126 · Cake day: November 20, 2017
About
meet.hn/city/br-Rio-de-Janeiro · Data Scientist @ Petrobras