# Yura LLM Client for Katya server

Part of a project that aims to replace the native Ollama protocol. The protocol supports streaming, is usable over HTTPS, and makes it possible to attach a web client directly to the backend.

## Install
```bash
pip install -e .
```

## Build
```bash
make build
```

## Command line usage
```bash
yura ws://[host]:[port]/[path]/
```

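For example, with a Katya backend listening on localhost port 8765 under the path /chat/ (placeholder values, substitute your own deployment), the call would be `yura ws://localhost:8765/chat/`.
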
## Python
```python
import asyncio
from yura.client import AsyncClient

async def communicate():
    client = AsyncClient("ws://[host]:[port]/[path]/")
    async for response in client.chat("Your prompt"):
        print(response)

asyncio.run(communicate())
```
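Because `client.chat()` yields the reply as a stream of chunks, you can also collect the chunks into a single string. The sketch below is illustrative: the `full_reply` helper and the endpoint `ws://localhost:8765/chat/` are placeholders, and it assumes each yielded chunk is a plain text fragment.

```python
import asyncio

from yura.client import AsyncClient


async def full_reply(prompt: str) -> str:
    # Placeholder endpoint; replace host, port and path with your own deployment.
    client = AsyncClient("ws://localhost:8765/chat/")
    chunks = []
    # chat() streams the reply; gather every chunk as it arrives.
    async for response in client.chat(prompt):
        chunks.append(response)
    # Assumes each streamed response is a plain text fragment.
    return "".join(chunks)


print(asyncio.run(full_reply("Your prompt")))
```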