Readit News
amks commented on Using GPT-4 Vision with Vimium to browse the web   github.com/ishan0102/vimG... · Posted by u/wvoch235
snake_doc · 2 years ago
Ah, very similar to Adept’s[1] concept? Though their product doesn't seem to be ready yet.

[1] https://www.adept.ai/

amks commented on Fuyu-8B: A multimodal architecture for AI agents   adept.ai/blog/fuyu-8b... · Posted by u/averylamp
abrichr · 2 years ago
Thank you for the release!

What can you tell us about this:

> Our internal models (based on Fuyu) have extra capabilities related to our product. In particular,

> 1. They can reliably perform OCR on high-resolution images

> 2. They can do fine-grained localization of text and UI elements within those images

> 3. They can answer questions about images of UIs

Is this just a matter of additional fine tuning, or are there architectural differences?

amks · 2 years ago
Even in experiments that only add more fine-tuning, we've seen models gain these capabilities!

amks commented on Persimmon-8B   adept.ai/blog/persimmon-8... · Posted by u/jgershen
thewataccount · 3 years ago
Awesome! I applaud everyone training new models and attempting different techniques!

I'm concerned about the current download's availability - it's two URLs to some object storage. I find that these go dark rather quickly for many different reasons (accidentally moving it, bandwidth limits, deleting it later, etc.).

I'm curious if there's a reason it's not also hosted on Hugging Face? I'm not saying they're the best place, but redundancy is good, most models have entries there, they have a very good CDN, and it isn't as likely to go dark accidentally.

amks · 3 years ago
We're working on it!

u/amks

Karma: 32
Cake day: September 7, 2023
About
ML/PL research @ Berkeley; AI Agents @ Adept