But the key issue is going to be privacy. I'm not well versed in LLMs, so I'm sorry if this is obvious, but can I use something like this without sending my data outside my own organisation?
It's one thing to do a Show HN / share; it's another thing to spam it with your ads.
The earlier post was a report summarizing LLM labeling benchmarking results. This post shares the open source library.
Neither is intended to be an ad. Our hope in sharing these is to demonstrate how LLMs can be used for data labeling, and to get feedback from the community.
For us, human labeling is surprisingly cheap; the main advantage of GPT-4 would be that it's much faster. Since scams are always changing, we could generate new labels regularly and continuously retrain our model.
In the end we didn't go down that route; there were several problems:
- GPT-4's accuracy wasn't as good as human labelers'. I believe this is because scam messages are intentionally deceptive and require a much more general understanding of the world than the datasets used in this article, which feature simpler labeling problems. I also don't trust that there was no funny business in generating the results for this blog, since there is a clear conflict of interest with the business that owns it.
- GPT-4 would be consistently fooled by certain types of scams, whereas human annotators work off a consensus procedure. This could probably be solved in the future once there's a larger pool of high-quality LLMs available that we can pool for consensus (a rough sketch of what that could look like follows this list).
- Concern that some PII would accidentally get sent to OpenAI; of course, nobody trusts that those guys will treat our customers' data with any appropriate level of ethics.
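For what it's worth, the pooling idea is simple to sketch: query several models independently and only accept a label when enough of them agree, escalating to a human otherwise. A minimal illustration; the labeler callables are hypothetical stand-ins for whatever model clients you'd actually use:

```python
from collections import Counter
from typing import Callable, List, Optional

def consensus_label(
    message: str,
    labelers: List[Callable[[str], str]],
    min_agreement: float = 0.67,
) -> Optional[str]:
    """Query several independent labelers and return the majority label,
    or None when agreement is too low (i.e., escalate to human review)."""
    votes = [labeler(message) for labeler in labelers]
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None

# Usage (hypothetical labelers wrapping different LLMs or prompt variants):
# result = consensus_label(msg, [gpt4_labeler, claude_labeler, llama_labeler])
```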
All the datasets and labeling configs used for these experiments are available in our GitHub repo (https://github.com/refuel-ai/autolabel), as mentioned in the report. Hope these are useful!
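For anyone who hasn't opened the repo: a labeling config is just a small JSON/dict describing the task, model, and prompt. A trimmed-down sketch of the rough shape (field names simplified here; check the repo for the exact, current schema):

```python
# Hypothetical, simplified config for a scam-detection classification task.
config = {
    "task_name": "ScamDetection",
    "task_type": "classification",
    "model": {"provider": "openai", "name": "gpt-4"},
    "prompt": {
        "task_guidelines": "Decide whether the message below is a scam.",
        "labels": ["scam", "not_scam"],
        "example_template": "Message: {example}\nLabel: {label}",
    },
}
```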
Fixed it for you.
Is there some noise in these labels? Sure! But relative performance measured against them is still a valid evaluation.
Autolabel is quite orthogonal to this - it's a library that makes it easy to interact with LLMs to label text datasets for NLP tasks.
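Roughly, the flow is: point an agent at a config and a dataset, dry-run it to preview prompts and estimated cost, then label. A simplified sketch; exact method signatures have evolved across versions, so please check the README for the current API:

```python
from autolabel import LabelingAgent

# config as in the sketch above (a dict, or a path to a JSON file)
agent = LabelingAgent(config="config.json")

agent.plan("dataset.csv")           # dry run: preview prompts and cost estimate
labeled = agent.run("dataset.csv")  # call the LLM and produce labels
```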
We are actively looking at integrating function calling into Autolabel, though, to improve label quality and support downstream processing.
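To make the idea concrete: function calling lets the model return the label as structured JSON instead of free text, which removes a whole class of output-parsing errors. A standalone sketch using the openai Python client, not Autolabel's actual integration (which is still being designed); the submit_label function and the label set are hypothetical:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Describe the label as a required, enum-constrained function argument so the
# model must answer with structured JSON rather than free-form text.
tools = [{
    "type": "function",
    "function": {
        "name": "submit_label",  # hypothetical function name
        "description": "Submit the label for the given message.",
        "parameters": {
            "type": "object",
            "properties": {
                "label": {"type": "string", "enum": ["scam", "not_scam"]},
            },
            "required": ["label"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "Label this message: 'You won a prize! Click here.'"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "submit_label"}},
)

# The arguments come back as a JSON string, e.g. '{"label": "scam"}'.
print(response.choices[0].message.tool_calls[0].function.arguments)
```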