LlamaIndex is building a platform for AI agents that can find information, synthesize insights, generate reports, and take actions over the most complex enterprise data.
We are seeking an exceptional engineer to join our growing LlamaParse team. You will work at the intersection of document processing, machine learning, and software engineering to push the boundaries of what's possible in document understanding. As a key member of a focused team, you will have significant impact on our product's direction and technical architecture.
We are also hiring for a range of other roles; see our career page:
- Backend Software Engineer
- Forward Deploy Engineer
- Founding AI Engineer
- Open Source Engineer (Python)
- Founding Lead Product Manager
- Platform Engineer
- Senior Developer Relations Engineer
- Senior / Staff Backend Engineer
- Product Marketing Manager
However, this comes at a high cost in tokens and latency, though it results in much better parse quality. Hopefully newer models will improve on this.
The hard part is preventing the model from ignoring parts of the page and from hallucinating (see some of the GPT-4o samples here, like the Xanax notice: https://www.llamaindex.ai/blog/introducing-llamaparse-premiu...)
However, these models will keep getting better, and we may soon have a good PDF-to-Markdown model.
- An image will take 10x the tokens on gpt-4o-mini vs gpt-4.
- On Gemini 2.5 Pro, output tokens are billed as tokens, except if you are using structured output, in which case every character is counted as a token for billing.
- ...
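To make the structured-output quirk concrete, here is a minimal sketch of how the two billing rules diverge for the same response. The ~4-characters-per-token heuristic and the character-equals-token rule as stated above are assumptions for illustration, not official billing logic:

```python
# Sketch: estimate billed output tokens under two (assumed) billing rules.
# The ~4 chars/token heuristic is a rough rule of thumb, not a real tokenizer.

def billed_output_tokens(text: str, structured_output: bool) -> int:
    """Estimate billed output tokens for a response.

    Under the reported structured-output rule, every character is billed
    as one token; otherwise we approximate ~4 characters per token.
    """
    if structured_output:
        return len(text)           # one token per character
    return max(1, len(text) // 4)  # crude heuristic, illustration only

answer = '{"drug": "alprazolam", "brand": "Xanax", "schedule": "IV"}'
plain = billed_output_tokens(answer, structured_output=False)
structured = billed_output_tokens(answer, structured_output=True)
print(plain, structured)
```

The same JSON answer bills roughly 4x more tokens under the character-per-token rule, which is exactly the kind of surprise a flat price-per-token figure hides.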
Having the price per token is nice, but what you really need to know is how much a given query / answer will cost you, as not all tokens are equal.
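One way to reason about this is to price a whole query rather than a token: combine each model's input/output rates with the token counts that model actually bills for the same request. The model names and prices below are placeholders, not real quotes:

```python
# Sketch: per-query cost instead of a bare price-per-token.
# Prices are hypothetical placeholders (USD per 1M tokens).

PRICES = {
    "model-mini": (0.15, 0.60),   # (input $/1M tok, output $/1M tok)
    "model-pro": (2.50, 10.00),
}

def query_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one query, given the token counts billed by that model.

    Token counts must come from each model's own tokenizer and billing
    rules, since the same text or image bills differently per model.
    """
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# The same page image may bill 10x the tokens on the "mini" model,
# so compare total cost, not per-token price:
print(query_cost("model-mini", input_tokens=25_000, output_tokens=1_000))
print(query_cost("model-pro", input_tokens=2_500, output_tokens=1_000))
```

Only by running both sides of this comparison for a representative query do you learn what a parse actually costs; the per-token rate alone can be misleading in either direction.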