Local AI Provider

Run AI models locally with Ollama, LM Studio, LocalAI, or vLLM - complete privacy, no cloud required.

Features

  • Local AI Model Support - Run AI models entirely on your own hardware
  • Ollama Integration - Seamless integration with Ollama for easy model management
  • LM Studio Support - Connect to LM Studio's local inference server
  • LocalAI/vLLM Support - Support for LocalAI and vLLM backends
  • Privacy-First AI - Keep all data on-premises with zero cloud dependency

Requirements

Requirement     Details
Dependencies    AI Core
PHP Version     8.2+
Local Server    Ollama, LM Studio, LocalAI, or vLLM running locally
Hardware        Sufficient GPU/CPU for model inference

Installation

  1. Ensure the AI Core module is installed and enabled
  2. Install your preferred local AI backend (Ollama recommended)
  3. Enable the Local AI Provider module in Settings > Modules
  4. Configure connection to your local AI server

Configuration

Navigate to Settings > AI Core > Providers > Local AI to configure the following (an illustrative configuration sketch appears after this list):

  • Backend Type - Select Ollama, LM Studio, LocalAI, or vLLM
  • Server URL - Local server address (default: http://localhost:11434 for Ollama)
  • Default Model - Select default model from available local models
  • Timeout - Request timeout for local inference
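
These settings map naturally onto a small provider configuration. The sketch below is illustrative only, assuming a Laravel-style config file; the file path (config/local-ai.php), the env variable names, and the keys backend, url, default_model, and timeout are assumptions, not the module's documented schema.

<?php

// config/local-ai.php - hypothetical config file; key and env names are assumptions for illustration
return [
    'backend' => env('LOCAL_AI_BACKEND', 'ollama'),          // ollama | lmstudio | localai | vllm
    'url' => env('LOCAL_AI_URL', 'http://localhost:11434'),  // local server address
    'default_model' => env('LOCAL_AI_MODEL', 'llama3.2'),    // any model available on the backend
    'timeout' => env('LOCAL_AI_TIMEOUT', 120),               // request timeout in seconds
];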

Ollama Setup

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.2

# Start server (usually automatic)
ollama serve
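
Before configuring the module, you can confirm the Ollama server is reachable. The check below uses Laravel's HTTP client against Ollama's /api/tags endpoint, which lists the models you have pulled; it is a standalone diagnostic, not part of this module's API.

use Illuminate\Support\Facades\Http;

// Connectivity check: /api/tags returns the models available on the local Ollama server.
$response = Http::get('http://localhost:11434/api/tags');

if ($response->successful()) {
    // Each entry includes the model name, size, and last-modified time.
    $modelNames = collect($response->json('models'))->pluck('name');
    // e.g. ["llama3.2:latest", "mistral:latest"]
}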

LM Studio Setup

  1. Download and install LM Studio from lmstudio.ai
  2. Load your preferred model
  3. Start the local server (default port: 1234); a quick connectivity check is sketched below
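
To verify the server is up, query LM Studio's OpenAI-compatible endpoint. This sketch again uses Laravel's HTTP client directly and is independent of the module; it assumes the default port from step 3.

use Illuminate\Support\Facades\Http;

// LM Studio serves an OpenAI-compatible API; /v1/models lists the currently loaded models.
$models = Http::get('http://localhost:1234/v1/models')->json('data');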

Usage

Basic Local AI Request

use Modules\AICore\Services\AIService;

$aiService = app(AIService::class);

$response = $aiService->complete([
    'provider' => 'local',
    'model' => 'llama3.2',
    'prompt' => 'Summarize this document...',
]);
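
For troubleshooting, the same prompt can be sent straight to an Ollama backend, bypassing the module. The sketch below talks to Ollama's /api/generate endpoint with streaming disabled, using Laravel's HTTP client; it is a low-level debugging equivalent, not the module's documented API.

use Illuminate\Support\Facades\Http;

// Direct, non-streaming completion request against Ollama (useful when isolating issues).
$result = Http::timeout(120)->post('http://localhost:11434/api/generate', [
    'model' => 'llama3.2',
    'prompt' => 'Summarize this document...',
    'stream' => false, // return one JSON object instead of a stream of chunks
]);

$text = $result->json('response'); // the generated completion text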

Specifying Backend

$response = $aiService->complete([
    'provider' => 'local',
    'backend' => 'ollama',
    'model' => 'mistral',
    'prompt' => 'Your prompt here',
]);

Supported Backends

Backend      Default Port    Notes
Ollama       11434           Recommended for ease of use
LM Studio    1234            Good for experimentation
LocalAI      8080            OpenAI-compatible API
vLLM         8000            High-performance inference
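
The default ports above imply a base URL per backend. The helper below is only an illustration of that mapping; it is not part of the module, and the backend identifier strings are assumptions.

// Illustrative mapping from backend name to its default local base URL (per the table above).
function defaultBaseUrl(string $backend): string
{
    return match ($backend) {
        'ollama' => 'http://localhost:11434',
        'lmstudio' => 'http://localhost:1234',
        'localai' => 'http://localhost:8080',
        'vllm' => 'http://localhost:8000',
        default => throw new InvalidArgumentException("Unknown backend: {$backend}"),
    };
}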

Supported Models

Llama Series (Meta)

Model           Parameters    Context    Description
llama3.2        3B            128K       Latest, optimized for on-device inference
llama3.2:1b     1B            128K       Smallest, for resource-constrained devices
llama3.1        8B            128K       Powerful general capabilities
llama3.1:70b    70B           128K       Large model (requires significant GPU)

Mistral Series

Model           Parameters    Context    Description
mistral         7B            32K        Efficient with excellent performance
mistral-nemo    12B           128K       State-of-the-art with large context

Code-Focused Models

Model             Parameters    Description
codellama         7B            Code generation and completion
deepseek-coder    6.7B          Multi-language code support
qwen2.5-coder     7B            Strong code understanding

Other Models

Model        Parameters    Description
qwen2.5      7B            Excellent multilingual model
phi3         3.8B          Compact but powerful (Microsoft)
phi3.5       3.8B          Latest Phi with improvements
gemma2       9B            Google open model
gemma2:2b    2B            Small and fast

Embedding Models

Model                Dimensions    Description
nomic-embed-text     768           Excellent local embeddings
mxbai-embed-large    1024          High-quality for RAG
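
On an Ollama backend, these models can be exercised through Ollama's /api/embeddings endpoint. The sketch below calls it directly with Laravel's HTTP client; whether this module exposes its own embeddings helper is not covered here.

use Illuminate\Support\Facades\Http;

// Request an embedding vector for a piece of text from the local Ollama server.
$response = Http::post('http://localhost:11434/api/embeddings', [
    'model' => 'nomic-embed-text',
    'prompt' => 'Text to embed for semantic search or RAG',
]);

$vector = $response->json('embedding'); // array of 768 floats for nomic-embed-text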

Notes

  • Local models require significant hardware resources
  • No internet connection required after model download
  • All data stays on your servers - complete privacy
  • Performance depends on your hardware specifications
  • GPU acceleration recommended for larger models