# Local AI Provider
Run AI models locally with Ollama, LM Studio, LocalAI, or vLLM - complete privacy, no cloud required.
## Features
- Local AI Model Support - Run AI models entirely on your own hardware
- Ollama Integration - Seamless integration with Ollama for easy model management
- LM Studio Support - Connect to LM Studio's local inference server
- LocalAI/vLLM Support - Support for LocalAI and vLLM backends
- Connection Testing - Test connectivity to your local AI server
- Model Discovery - Automatically discover available models from your local server
- Test Chat - Test chat functionality directly from settings
## Requirements
| Requirement | Details |
|---|---|
| Dependencies | AICore |
| PHP Version | 8.2+ |
| Local Server | Ollama, LM Studio, LocalAI, or vLLM running locally |
| Hardware | Sufficient GPU/CPU for model inference |
## Installation

### Enable via Admin Panel
1. Log in as administrator
2. Navigate to Settings > Addons
3. Find Local AI Provider and click Enable
### Enable via Command Line

```bash
php artisan module:enable LocalAIProvider
php artisan migrate
```

> **Note:** AI Core must be installed and enabled before enabling this module. You must also have a local AI server (e.g., Ollama) installed and running on your hardware.
## Configuration

Navigate to Local AI Provider > Settings (`/localaiprovider/settings`) to configure:

- Server URL - Local server address (default: `http://localhost:11434` for Ollama)
- Connection Test - Test connectivity to your local AI server
- Available Models - Discover and list models available on your local server
- Test Chat - Send a test message to verify the integration works
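Model discovery against Ollama works over its documented REST API: `GET /api/tags` returns a JSON object whose `models` array lists the installed models. A minimal sketch of that check (illustrative only, not the module's actual code):

```python
import json
import urllib.request


def parse_tags(body: str) -> list[str]:
    """Extract model names from an Ollama /api/tags JSON response."""
    return [m["name"] for m in json.loads(body).get("models", [])]


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 10.0) -> list[str]:
    """Fetch installed model names from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_tags(resp.read().decode())
```

If the request raises (connection refused, timeout), the server is not reachable, which is exactly what the Connection Test surfaces in the UI.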
### Ollama Setup

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.2

# Start the server (usually starts automatically after install)
ollama serve
```
### LM Studio Setup

1. Download and install LM Studio from [lmstudio.ai](https://lmstudio.ai)
2. Load your preferred model
3. Start the local server (default port: 1234)
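LM Studio's local server speaks the OpenAI-compatible chat API, so a request targets `POST /v1/chat/completions` on port 1234. A sketch of how such a request body is assembled (the function name is hypothetical):

```python
import json


def lmstudio_chat_request(model: str, prompt: str,
                          base_url: str = "http://localhost:1234") -> tuple[str, str]:
    """Build the URL and JSON body for an OpenAI-compatible chat request."""
    url = f"{base_url}/v1/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body)
```

The same shape works for LocalAI and vLLM, which also expose OpenAI-compatible endpoints.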
### Supported Backends
| Backend | Default Port | Notes |
|---|---|---|
| Ollama | 11434 | Recommended for ease of use |
| LM Studio | 1234 | Good for experimentation |
| LocalAI | 8080 | OpenAI-compatible API |
| vLLM | 8000 | High-performance inference |
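The port table above implies a default base URL per backend. A small helper capturing that mapping (a sketch; the module's own defaults may be configured differently):

```python
# Default ports from the Supported Backends table.
DEFAULT_PORTS = {"ollama": 11434, "lmstudio": 1234, "localai": 8080, "vllm": 8000}


def default_base_url(backend: str, host: str = "localhost") -> str:
    """Return the conventional local base URL for a supported backend."""
    try:
        return f"http://{host}:{DEFAULT_PORTS[backend]}"
    except KeyError:
        raise ValueError(f"unsupported backend: {backend}") from None
```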
## Usage

### Settings Page

Navigate to Local AI Provider > Settings (`/localaiprovider/settings`) to configure and manage your local AI server. The settings page is organized into four sections:

#### Server Configuration
- Provider Type - Select your local AI server: Ollama, LM Studio, LocalAI, vLLM, or Text Generation WebUI
- Server URL - The base URL of your local AI server (defaults vary by provider)
- Default Model - The default model to use for requests (e.g., `llama3.2`, `mistral`, `codellama`)
- API Key - Optional API key if your local server requires authentication
#### Generation Settings
- Temperature - Controls randomness of responses (0 = deterministic, 2 = very creative). Default: 0.7
- Max Tokens - Maximum number of tokens to generate (1-32000). Default: 2048
- Top P - Nucleus sampling threshold (0-1). Default: 0.9
- Top K - Top-k sampling value (1-100). Default: 40
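The ranges and defaults above can be expressed as a small validation step that clamps out-of-range values. This helper is a hypothetical sketch of that logic, not the module's API:

```python
# (min, max, default) for each setting, taken from the documented ranges above.
RANGES = {
    "temperature": (0.0, 2.0, 0.7),
    "max_tokens": (1, 32000, 2048),
    "top_p": (0.0, 1.0, 0.9),
    "top_k": (1, 100, 40),
}


def normalize_generation_settings(overrides=None) -> dict:
    """Start from documented defaults; clamp overrides into their valid range."""
    settings = {name: default for name, (_, _, default) in RANGES.items()}
    for name, value in (overrides or {}).items():
        lo, hi, _ = RANGES[name]
        settings[name] = min(max(value, lo), hi)
    return settings
```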
#### Timeout Settings
- Request Timeout - Maximum seconds to wait for an AI response (10-600). Default: 120
- Connection Timeout - Maximum seconds to wait for server connection (1-60). Default: 10
#### Behavior Settings
- Auto-Detect Server - Automatically detect which local AI server is running
- Fallback to Cloud - Fall back to cloud providers (e.g., OpenAI) if the local server is unavailable
- Enable Streaming - Enable streaming responses for better user experience
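Auto-detection amounts to probing each supported server's health/model endpoint on its default port and taking the first that answers. A sketch under that assumption (the probe order and exact detection logic of the module are not documented here):

```python
from typing import Callable, Optional

# Candidate (backend, health URL) pairs on their default ports. Ollama uses
# /api/tags; the other three expose OpenAI-compatible /v1/models.
CANDIDATES = [
    ("ollama", "http://localhost:11434/api/tags"),
    ("lmstudio", "http://localhost:1234/v1/models"),
    ("localai", "http://localhost:8080/v1/models"),
    ("vllm", "http://localhost:8000/v1/models"),
]


def detect_server(probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first backend whose health URL answers, else None.

    `probe` performs the actual HTTP check, so it can use a short connection
    timeout in production or be faked in tests.
    """
    for name, url in CANDIDATES:
        if probe(url):
            return name
    return None
```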
### Testing Your Setup
From the settings page, you can:
- Test Connection - Verify that your local AI server is reachable and responding
- View Available Models - Discover models installed on your local server
- Test Chat - Send a test message to confirm the full pipeline works
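Against Ollama, a Test Chat round-trip is a non-streaming `POST /api/chat`; the reply text sits at `message.content` in the response JSON. A sketch of both halves (helper names are hypothetical):

```python
import json


def build_test_chat(model: str = "llama3.2") -> dict:
    """Non-streaming test payload for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "Reply with OK if you can read this."}],
        "stream": False,
    }


def extract_reply(body: str) -> str:
    """Pull the assistant text out of a non-streaming /api/chat response."""
    return json.loads(body)["message"]["content"]
```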
### AI Core Integration
When settings are saved, the module automatically:
- Registers itself as a provider in AI Core with zero cost per token
- Syncs available models from your local server into AI Core's model list
- Deactivates models in AI Core that are no longer present on your local server
After setup, you can assign local models to modules via AI Core > Module Configuration, just like any cloud provider.
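The sync described above is a set difference in both directions: models on the server but not in AI Core get registered, and registered models no longer on the server get deactivated. A sketch of that planning step (illustrative, not the module's code):

```python
def plan_model_sync(server_models: set[str],
                    registered: set[str]) -> tuple[set[str], set[str]]:
    """Return (models to register, models to deactivate).

    Mirrors the described behavior: new local models are synced into
    AI Core, and registered models missing from the server are deactivated.
    """
    return server_models - registered, registered - server_models
```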
## Notes
- Local models require significant hardware resources
- No internet connection required after model download
- All data stays on your servers - complete privacy
- Performance depends on your hardware specifications
- GPU acceleration recommended for larger models
- The module registers as a provider addon in AI Core
Changelog: View version history