feat: Realtime API support #3722

Draft: wants to merge 23 commits into master

mudler
Owner

@mudler mudler commented Oct 3, 2024

Description

This PR fixes #3714

And also covers #191

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.


netlify bot commented Oct 3, 2024

Deploy Preview for localai ready!

Name Link
🔨 Latest commit a784372
🔍 Latest deploy log https://app.netlify.com/sites/localai/deploys/67363cd0d2697e000883819b
😎 Deploy Preview https://deploy-preview-3722--localai.netlify.app

@mudler mudler force-pushed the feat/realtime branch 3 times, most recently from 5435a07 to 2db3f3d Compare October 9, 2024 10:57

github-actions bot commented Oct 9, 2024

yamllint Failed

::group::gallery/arch-function.yaml
::error file=gallery/arch-function.yaml,line=66,col=22::66:22 [new-line-at-end-of-file] no new line character at the end of file
::endgroup::

Workflow: Yamllint GitHub Actions, Action: __karancode_yamllint-github-action, Lint: gallery

@mudler
Owner Author

mudler commented Oct 14, 2024

Just for reference: openai-realtime-console seems quite nice for testing things out, especially at this stage. I've opened a PR upstream to include a Dockerfile and instructions on how to use it with a local server: openai/openai-realtime-console#59

core/http/app.go: Fixed
@mudler mudler force-pushed the feat/realtime branch 2 times, most recently from 1c61aad to 22579bc Compare October 18, 2024 14:06
@mudler mudler force-pushed the feat/realtime branch 2 times, most recently from 2886390 to deff060 Compare October 30, 2024 15:10
@mudler mudler changed the title from "WIP: realtime API stub" to "feat: Realtime API support" Oct 31, 2024
@mudler mudler force-pushed the feat/realtime branch 4 times, most recently from 7a9d6e8 to ee4ae33 Compare November 6, 2024 17:35
@mattkanwisher
Contributor

What's the best option here if we want to contribute? Just fork the branch and open PRs against it?

@mudler
Owner Author

mudler commented Nov 7, 2024

> What's the best option here if we want to contribute? Just fork the branch and open PRs against it?

Yes, that would work just fine!

@mudler mudler force-pushed the feat/realtime branch 2 times, most recently from 60579fb to 4034562 Compare November 7, 2024 22:54
@mudler mudler force-pushed the feat/realtime branch 6 times, most recently from 818122a to a1da931 Compare November 12, 2024 18:12
@mudler
Owner Author

mudler commented Nov 13, 2024

Currently working on creating the VAD backend with silero and attaching it to the compilation process and to the binary releases.

Signed-off-by: Ettore Di Giacinto <[email protected]>
Testing with:

```yaml
name: gpt-4o
pipeline:
 tts: voice-it-riccardo_fasol-x-low
 transcription: whisper-base-q5_1
 llm: llama-3.2-1b-instruct:q4_k_m
```
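As a rough illustration of how the three models named in this config might be chained for one conversational turn, here is a minimal sketch. It assumes a speech -> text -> reply text -> speech flow, which the config itself does not spell out. The `Pipeline` class, `run_turn`, and the `fake_*` stand-in backends are hypothetical names made up for this example and are not taken from the PR.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Pipeline:
    # Hypothetical container; field names mirror the `pipeline:` keys above.
    tts: str
    transcription: str
    llm: str


# Each stage takes (model_name, input) and returns that stage's output.
Stage = Callable[[str, object], object]


def run_turn(p: Pipeline, audio_in: bytes,
             transcribe: Stage, generate: Stage, synthesize: Stage) -> object:
    """Assumed flow: speech -> text -> reply text -> speech."""
    text_in = transcribe(p.transcription, audio_in)
    text_out = generate(p.llm, text_in)
    return synthesize(p.tts, text_out)


# Stand-in backends so the sketch runs without any real model.
def fake_transcribe(model, audio):
    return f"[{model}] {len(audio)} bytes"


def fake_generate(model, text):
    return f"[{model}] reply to: {text}"


def fake_synthesize(model, text):
    return f"[{model}] {text}".encode()


pipeline = Pipeline(
    tts="voice-it-riccardo_fasol-x-low",
    transcription="whisper-base-q5_1",
    llm="llama-3.2-1b-instruct:q4_k_m",
)
audio_out = run_turn(pipeline, b"\x00" * 4,
                     fake_transcribe, fake_generate, fake_synthesize)
```

Passing the stages in as callables keeps the sketch independent of any particular backend; a real server would presumably resolve the model names to loaded backends instead.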

Signed-off-by: Ettore Di Giacinto <[email protected]>
One is anyToAny models, which require a VAD model, and the other is
wrappedModel, which also requires a VAD model along with the others in
the pipeline.
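To make the requirement difference between the two kinds concrete, here is a small sketch of a config check. The `REQUIRED_MODELS` table and `missing_models` helper are hypothetical; treating "the others in the pipeline" as `transcription`, `llm` and `tts` is an assumption based on the test config, not something the commit states.

```python
# Hypothetical requirement table: both kinds need a VAD model, and the
# wrapped kind additionally needs the (assumed) pipeline stages.
REQUIRED_MODELS = {
    "anyToAny": {"vad"},
    "wrappedModel": {"vad", "transcription", "llm", "tts"},
}


def missing_models(kind: str, configured: dict) -> list:
    """Return the sorted names of required models that are unset or empty."""
    required = REQUIRED_MODELS[kind]
    return sorted(name for name in required if not configured.get(name))
```

For example, a wrapped config that sets `tts`, `transcription` and `llm` but no `vad` would be reported as missing only `vad`.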

Signed-off-by: Ettore Di Giacinto <[email protected]>
@mudler
Owner Author

mudler commented Nov 14, 2024

Hm. Things are going in the right direction, but VAD still isn't right: it detects the start of the conversation but can't detect the end of the segment yet.
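For context, one common way to find the end of a speech segment is to wait for a run of consecutive non-speech frames before closing it. The sketch below illustrates that idea only; it is not the silero backend or the code in this PR, and `find_segments`, `threshold` and `min_silence_frames` are hypothetical names and values chosen for the example.

```python
def find_segments(probs, threshold=0.5, min_silence_frames=3):
    """Return (start, end) frame index pairs, with end exclusive.

    probs is a sequence of per-frame speech probabilities. A segment
    opens at the first frame at or above the threshold and closes once
    min_silence_frames consecutive frames fall below it.
    """
    segments = []
    start = None   # index of the first frame of the open segment
    silence = 0    # consecutive non-speech frames seen inside the segment
    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:
                # End just after the last speech frame, not after the silence.
                segments.append((start, i - silence + 1))
                start = None
                silence = 0
    if start is not None:
        # Input ended while a segment was open: close it at the last speech frame.
        segments.append((start, len(probs) - silence))
    return segments
```

A short pause (fewer than `min_silence_frames` frames) does not close the segment, which is the trade-off of this approach: a larger value tolerates pauses but delays end detection.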

Development

Successfully merging this pull request may close these issues.

Add support for realtime API
2 participants