# Thalis AI Stack
The Thalis AI stack is an open-source platform designed to bring your original characters to virtual life. The platform is modular and self-hosted, allowing you to craft unique and immersive character experiences.
## Key Features
- Self-hosting capabilities: Ensure that all data remains under your control, minimizing the risk of data breaches and unwanted surveillance.
- Modular architecture: Enables easy updates and patches, safeguarding your system against potential vulnerabilities.
- Custom models and workflows: Support for tailored character personalities, appearances, and traits.
- Multi-GPU support: Faster processing and distributed deployments for availability and scalability.
## Generators and Tools
The Thalis AI stack uses a variety of open-source tools for inference, including:
- SpeechT5 for high-quality text-to-speech synthesis
- ComfyUI for high-resolution image generation using Stable Diffusion and Flux models
- Ollama for streaming text generation
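As a rough illustration of what streaming text generation from Ollama involves, the sketch below reads the newline-delimited JSON that Ollama's `/api/generate` endpoint returns when `stream` is true. It is not taken from the Thalis codebase. The model name `llama3` and the helper names are assumptions chosen for the example, and the host is Ollama's default local address.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_response_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a stream of NDJSON chunks (one JSON object per line)."""
    for raw in lines:
        if not raw.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        if chunk.get("response"):
            yield chunk["response"]
        if chunk.get("done"):
            break  # the final chunk is marked with "done": true


def stream_generate(prompt: str,
                    model: str = "llama3",  # assumption: any model you have pulled locally
                    host: str = "http://localhost:11434") -> Iterator[str]:
    """Send a prompt to a local Ollama server and yield text as it arrives."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # Iterating the response yields one line (one JSON chunk) at a time.
        yield from iter_response_text(response)
```

With an Ollama server running locally, `for piece in stream_generate("Introduce yourself."): print(piece, end="", flush=True)` would print the reply as it is generated. Keeping the parsing in `iter_response_text` separate from the network call makes it possible to exercise the parser without a running server.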
## Getting Started
To learn more about the Thalis AI stack, including its architecture, licensing, and community support, please visit our GitHub project page. There, you'll find detailed documentation, tutorials, and resources to help you get started with creating your own interactive character experiences.
## Comparison with Other Platforms
Compared with popular chatbot platforms and with the individual tools it builds on, the Thalis AI stack offers more ways to interact with your characters while keeping your data private and secure. Support for custom LoRA networks lets you customize your characters in ways that popular chatbot platforms do not allow. In the table below, numbers in parentheses refer to the notes that follow it.
| Category | Thalis AI | Popular Chatbot Platforms | ComfyUI | Ollama | open-webui |
|---|---|---|---|---|---|
| Self-hosted | Yes | No | Yes | Yes | Yes |
| Privacy & data control | Yes | No | Yes | Yes | Yes |
| Completely open-source | Yes | No | Yes | Yes | Yes |
| Support for desktop GPUs | Yes | No | Yes | Yes | Yes |
| Text chat | Yes | Yes | No (1) | Yes (2) | Yes |
| Audio generation | Yes | Yes | Yes (5) | No | Yes |
| Image generation | Yes | Yes | Yes | No | Yes |
| Per-character image models | Yes | No | Yes (3) | No (3) | No |
| Per-character text models | Yes | Yes | No (3) | Yes (3) | Yes (4) |
| Character LoRA networks | Yes | No | Yes | Yes | No |
| Dynamic workflow generator | Yes | No | No | No | No |
| Image prompt generator | Yes | No | Yes (5) | No | No |
| Total | 12/12 | 4/12 | 9/12 | 7/12 | 8/12 |
Notes:
1. ComfyUI has custom nodes that support text generation, but we are not aware of any that allow bidirectional chat.
2. Ollama supports text chat through its command-line tools, but it does not include a web UI.
3. ComfyUI and Ollama do not have character profiles, but they do allow you to choose the model for each chat or image.
4. open-webui does not have character profiles, but it allows you to select the model for each chat.
5. ComfyUI supports audio generation and image prompt generation through custom nodes.
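To make the "Character LoRA networks" row more concrete, here is a minimal sketch of how a per-character LoRA could be spliced into a ComfyUI workflow in API (JSON) format. It is an illustration only, not the stack's actual workflow generator. The function name `add_character_lora`, the file names, and the assumption that node IDs are numeric strings are all hypothetical. `CheckpointLoaderSimple` and `LoraLoader` are ComfyUI's built-in node types.

```python
import copy


def add_character_lora(workflow: dict, lora_name: str, strength: float = 1.0) -> dict:
    """Return a copy of an API-format workflow with a LoraLoader after the checkpoint loader."""
    wf = copy.deepcopy(workflow)

    # Locate the checkpoint loader whose MODEL (output 0) and CLIP (output 1) feed the graph.
    ckpt_id = next(node_id for node_id, node in wf.items()
                   if node["class_type"] == "CheckpointLoaderSimple")

    # Assumption: node IDs are numeric strings, so the next free ID is max + 1.
    lora_id = str(max(int(node_id) for node_id in wf) + 1)

    # Rewire every consumer of the checkpoint's MODEL/CLIP outputs to the LoRA node.
    # The VAE output (index 2) is left untouched because LoRA does not modify it.
    for node in wf.values():
        for key, value in node["inputs"].items():
            if (isinstance(value, list) and len(value) == 2
                    and value[0] == ckpt_id and value[1] in (0, 1)):
                node["inputs"][key] = [lora_id, value[1]]

    # Add the LoRA node last so that its own inputs still point at the checkpoint.
    wf[lora_id] = {
        "class_type": "LoraLoader",
        "inputs": {
            "lora_name": lora_name,
            "strength_model": strength,
            "strength_clip": strength,
            "model": [ckpt_id, 0],
            "clip": [ckpt_id, 1],
        },
    }
    return wf
```

A workflow produced this way could then be submitted to a ComfyUI server in the usual manner. The original workflow dictionary is left unmodified, so the same base workflow can be reused for several characters.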
If you know of another multi-modal AI stack that you would like to see included in this comparison, please let us know. Other self-hosted stacks are especially welcome, as we would love to support other privacy-conscious open-source projects.