Posted by u/kfeeney 2 months ago
Show HN: Leilani – SIP client that streams call audio to the OpenAI Realtime API (leilani.dev)
We built a SIP user agent that registers with any PBX just like a softphone and streams call audio to OpenAI's Realtime API. No SIP trunking, call forwarding, or unintegrated voice systems. Leilani behaves like a normal extension… because it's just a normal SIP extension.

How it works

- Implements bog-standard SIP over TCP, so it connects just like a normal desk phone or softphone.
- Streams the call's RTP audio (mu-law) bidirectionally to and from OpenAI's Realtime API.
- Handles function calls for external actions (webhooks).
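To make "connects just like a desk phone" concrete, here is a minimal sketch of the first message a softphone-style user agent sends over its TCP connection: a SIP REGISTER request as laid out in RFC 3261. It is an illustration only, not Leilani's code (which is Rust). The function name `build_register`, the extension `1001`, and the host names are hypothetical, and digest authentication (most PBXs answer the first REGISTER with a 401 challenge) is left out.

```python
import uuid


def build_register(user: str, domain: str, local_ip: str, local_port: int,
                   cseq: int = 1, expires: int = 3600) -> str:
    """Build an unauthenticated SIP REGISTER request for TCP transport."""
    # RFC 3261 requires every branch parameter to start with this magic cookie.
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    tag = uuid.uuid4().hex[:8]
    call_id = f"{uuid.uuid4().hex}@{local_ip}"
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/TCP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{domain}>;tag={tag}",
        f"To: <sip:{user}@{domain}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <sip:{user}@{local_ip}:{local_port};transport=tcp>",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical extension 1001 on a hypothetical PBX.
    print(build_register("1001", "pbx.example.com", "192.0.2.10", 5060))
```

From the PBX's point of view, a client that sends this (and then answers the auth challenge) is indistinguishable from any other extension, which is the integration point the post is describing.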

Use cases

- After-hours auto-attendant
- Voicemail/intent capture with structured output
- Internal system lookup (CRM, scheduling, ticket creation, etc.) via function calls
- Replacing IVRs with natural conversation
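The function-call use cases boil down to a small loop: the model asks for a function by name, the client runs it (for example by calling a webhook), and the result goes back into the conversation. The sketch below shows that loop under some assumptions. `lookup_customer` and the `HANDLERS` registry are hypothetical stand-ins for a real webhook call. The two returned events use the Realtime API's `conversation.item.create` (with a `function_call_output` item) and `response.create` event types. How Leilani itself wires this up is not described in the post.

```python
import json
from typing import Callable, Dict, List


def lookup_customer(args: dict) -> dict:
    """Hypothetical handler; a real deployment would POST to a webhook here."""
    return {"phone": args["phone"], "found": False}


# Hypothetical registry mapping function names to local handlers.
HANDLERS: Dict[str, Callable[[dict], dict]] = {"lookup_customer": lookup_customer}


def handle_function_call(name: str, call_id: str, arguments: str) -> List[str]:
    """Run the requested function and return the JSON events to send back."""
    handler = HANDLERS.get(name)
    if handler is None:
        result = {"error": f"unknown function {name}"}
    else:
        # The model supplies its arguments as a JSON-encoded string.
        result = handler(json.loads(arguments))
    output_item = {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": call_id,
            "output": json.dumps(result),
        },
    }
    # Ask the model to continue speaking now that it has the result.
    return [json.dumps(output_item), json.dumps({"type": "response.create"})]
```

Returning an error object for unknown names, rather than raising, lets the model apologize to the caller instead of leaving the call silent.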

Why?

Most AI voice systems expect you to hand over call routing, use their SIP trunk, or adopt an entirely separate voice stack altogether. Nearly every company already has a SIP PBX, so we thought operating as a normal SIP extension was the simplest integration point.

Tech Stack

- The backend is built in asynchronous Rust.
- We connect to the Realtime API using WebSockets rather than SIP trunking or WebRTC.
- Hosted on a simple AWS EC2 instance.
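On the WebSocket side, audio reaches the Realtime API as base64 text inside JSON events rather than as RTP packets. Here is a rough sketch of that bridge for one inbound packet: strip the RTP header (RFC 3550), check the payload type is PCMU (static type 0 in RFC 3551), and wrap the payload in an `input_audio_buffer.append` event. This is written in Python for brevity and is not the project's Rust code. It assumes the mu-law bytes are forwarded untranscoded and that the session has already been configured for G.711 mu-law input. `rtp_payload` and `append_event` are hypothetical names.

```python
import base64
import json
import struct

PCMU_PAYLOAD_TYPE = 0  # static RTP payload type for G.711 mu-law


def rtp_payload(packet: bytes) -> bytes:
    """Strip the RTP header and return the raw audio payload."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second = packet[0], packet[1]
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    if second & 0x7F != PCMU_PAYLOAD_TYPE:
        raise ValueError("unexpected payload type")
    offset = 12 + 4 * (first & 0x0F)      # skip any CSRC identifiers
    if first & 0x10:                      # header extension present
        (ext_words,) = struct.unpack_from("!H", packet, offset + 2)
        offset += 4 + 4 * ext_words
    payload = packet[offset:]
    if first & 0x20 and payload:          # padding bit: last byte is pad length
        payload = payload[:len(payload) - payload[-1]]
    return payload


def append_event(packet: bytes) -> str:
    """Wrap one RTP packet's audio in a Realtime API append event."""
    audio = base64.b64encode(rtp_payload(packet)).decode("ascii")
    return json.dumps({"type": "input_audio_buffer.append", "audio": audio})
```

The outbound direction is the mirror image: decode the base64 audio from the model's audio events, cut it into frames, and prepend an RTP header with an incrementing sequence number and timestamp.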

Limitations / gotchas

- Currently only supports SIP over TCP; TLS support is coming soon.
- There are some NAT traversal assumptions (we behave like a softphone).
- Latency depends on PBX and model RTT and on audio frame sizes (currently seeing ~300ms across most deployments).
- You still need your own OpenAI key. Could be a positive or negative, depending on how you look at it :)
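To see why audio frame size shows up in the latency list: G.711 mu-law is 8,000 samples per second at one byte per sample, so a frame cannot be sent until its full duration of audio has been captured. The helper names and the 20 ms frame size below are illustrative assumptions, not measurements from Leilani; 20 ms is just a common packetization time for G.711.

```python
def frame_bytes(frame_ms: int, sample_rate_hz: int = 8000,
                bytes_per_sample: int = 1) -> int:
    """Payload size of one audio frame (defaults describe G.711 mu-law)."""
    return sample_rate_hz * frame_ms * bytes_per_sample // 1000


def packets_per_second(frame_ms: int) -> float:
    """How many RTP packets per second a given frame duration implies."""
    return 1000 / frame_ms


def buffering_delay_ms(frame_ms: int, frames_buffered: int) -> int:
    """Delay added by collecting this many frames before forwarding them."""
    return frame_ms * frames_buffered


if __name__ == "__main__":
    print(frame_bytes(20))            # 160 bytes per 20 ms frame
    print(packets_per_second(20))     # 50.0 packets per second
    print(buffering_delay_ms(20, 3))  # 60 ms if three frames are batched
```

Larger frames or batching mean fewer packets and WebSocket messages, but every frame held back adds its full duration to the delay before network and model round trips are counted.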

Link: https://leilani.dev
