Local AI Provider

Run AI models locally with Ollama, LM Studio, LocalAI, or vLLM - complete privacy, no cloud required.

Features

  • Local AI Model Support - Run AI models entirely on your own hardware
  • Ollama Integration - Seamless integration with Ollama for easy model management
  • LM Studio Support - Connect to LM Studio's local inference server
  • LocalAI/vLLM Support - Compatibility with LocalAI and vLLM backends
  • Connection Testing - Test connectivity to your local AI server
  • Model Discovery - Automatically discover available models from your local server
  • Test Chat - Test chat functionality directly from settings

Requirements

Requirement     Details
Dependencies    AICore
PHP Version     8.2+
Local Server    Ollama, LM Studio, LocalAI, or vLLM running locally
Hardware        Sufficient GPU/CPU for model inference

Installation

Enable via Admin Panel

  1. Log in as administrator
  2. Navigate to Settings > Addons
  3. Find Local AI Provider and click Enable

Enable via Command Line

php artisan module:enable LocalAIProvider
php artisan migrate
Note: AI Core must be installed and enabled before enabling this module. You must also have a local AI server (e.g., Ollama) installed and running on your hardware.
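
To confirm the module is enabled from the command line, you can list installed modules. This sketch assumes the module:enable command above is provided by the nwidart/laravel-modules package, the usual source of these commands; if your installation uses a different module system, the command may differ:

# List modules and their enabled/disabled status (laravel-modules)
php artisan module:list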

Configuration

Navigate to Local AI Provider > Settings (/localaiprovider/settings) to configure:

  • Server URL - Local server address (default: http://localhost:11434 for Ollama)
  • Connection Test - Test connectivity to your local AI server (see the command-line example after this list)
  • Available Models - Discover and list models available on your local server
  • Test Chat - Send a test message to verify the integration works
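
As a quick check outside the admin UI, you can query the server directly. The example below assumes Ollama on its default port; Ollama's /api/tags endpoint returns the models installed locally:

# List models installed on a local Ollama server (assumes default port 11434)
curl http://localhost:11434/api/tags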

Ollama Setup

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.2

# Start server (usually automatic)
ollama serve
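
Once the server is running, a quick smoke test, assuming the llama3.2 model pulled above:

# Confirm the model is present
ollama list

# Send a single non-streaming prompt to Ollama's generate endpoint
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'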

LM Studio Setup

  1. Download and install LM Studio from lmstudio.ai
  2. Load your preferred model
  3. Start the local server (default port: 1234); a quick request to verify it is shown below
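
LM Studio's local server exposes an OpenAI-compatible API, so a standard /v1/models request is enough to verify it is up; the port below is LM Studio's default:

# List models loaded in LM Studio (assumes default port 1234)
curl http://localhost:1234/v1/models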

Supported Backends

Backend       Default Port    Notes
Ollama        11434           Recommended for ease of use
LM Studio     1234            Good for experimentation
LocalAI       8080            OpenAI-compatible API
vLLM          8000            High-performance inference
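
If you are unsure which backend is running, probing the default ports is a reasonable first step. The routes below assume default configurations: Ollama uses its own /api/tags endpoint, while LM Studio, LocalAI, and vLLM all serve the OpenAI-compatible /v1/models endpoint:

# Probe each backend on its default port
curl -s http://localhost:11434/api/tags   # Ollama
curl -s http://localhost:1234/v1/models   # LM Studio
curl -s http://localhost:8080/v1/models   # LocalAI
curl -s http://localhost:8000/v1/models   # vLLM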

Usage

Settings Page

Navigate to Local AI Provider > Settings (/localaiprovider/settings) to configure and manage your local AI server. The settings page is organized into four sections:

Server Configuration

  • Provider Type - Select your local AI server: Ollama, LM Studio, LocalAI, vLLM, or Text Generation WebUI
  • Server URL - The base URL of your local AI server (defaults vary by provider)
  • Default Model - The default model to use for requests (e.g., llama3.2, mistral, codellama)
  • API Key - Optional API key if your local server requires authentication

Generation Settings

  • Temperature - Controls randomness of responses (0 = deterministic, 2 = very creative). Default: 0.7
  • Max Tokens - Maximum number of tokens to generate (1-32000). Default: 2048
  • Top P - Nucleus sampling threshold (0-1). Default: 0.9
  • Top K - Top-k sampling value (1-100). Default: 40
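
To see how these settings translate into a raw request, here is a sketch against Ollama's generate endpoint. The option names (temperature, num_predict, top_p, top_k) are Ollama's; other backends expose equivalents under their own parameter names:

# Generation request with explicit sampling options (Ollama)
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Write a haiku about local AI.",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "num_predict": 2048,
    "top_p": 0.9,
    "top_k": 40
  }
}'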

Timeout Settings

  • Request Timeout - Maximum seconds to wait for an AI response (10-600). Default: 120
  • Connection Timeout - Maximum seconds to wait for server connection (1-60). Default: 10
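
These two settings behave like curl's connection and total-time limits, which you can use to reproduce the module's defaults when testing by hand:

# Mirror the default timeouts: 10s to connect, 120s for the full response
curl --connect-timeout 10 --max-time 120 \
  http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Hello", "stream": false}'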

Behavior Settings

  • Auto-Detect Server - Automatically detect which local AI server is running
  • Fallback to Cloud - Fall back to cloud providers (e.g., OpenAI) if the local server is unavailable
  • Enable Streaming - Enable streaming responses for better user experience
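
Streaming is worth testing directly as well. Against Ollama, responses stream as newline-delimited JSON chunks; a sketch assuming the llama3.2 model:

# Streamed generation: each output line is a JSON chunk with a partial response
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Count to five.",
  "stream": true
}'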

Testing Your Setup

From the settings page, you can:

  • Test Connection - Verify that your local AI server is reachable and responding
  • View Available Models - Discover models installed on your local server
  • Test Chat - Send a test message to confirm the full pipeline works
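
The Test Chat action can be approximated from the command line. Assuming an Ollama backend, its chat endpoint accepts an OpenAI-style message list:

# Minimal chat round-trip (Ollama /api/chat)
curl -s http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Hello! Are you working?"}],
  "stream": false
}'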

AI Core Integration

When settings are saved, the module automatically:

  • Registers itself as a provider in AI Core with zero cost per token
  • Syncs available models from your local server into AI Core's model list
  • Deactivates models in AI Core that are no longer present on your local server

After setup, you can assign local models to modules via AI Core > Module Configuration, just like any cloud provider.

Notes

  • Local models require significant hardware resources
  • No internet connection required after model download
  • All data stays on your servers - complete privacy
  • Performance depends on your hardware specifications
  • GPU acceleration recommended for larger models
  • The module registers as a provider addon in AI Core

Changelog: View version history