Wiring LM Studio to Notepad++

As developers and power users, we’ve grown accustomed to leaning on automation to help us refactor messy code, troubleshoot obscure errors, or structure text. But sending proprietary source code, internal logs, or sensitive text data up to cloud-hosted servers comes with serious privacy risks.

What if you could run a completely private, state-of-the-art LLM locally on your own computer and interact with it directly inside your text editor?

By pairing Notepad++ with LM Studio, a desktop app for running LLMs locally, you can do exactly that. You can trigger local AI models with a single keyboard shortcut: no cloud, no subscription fees, and nothing leaving your machine.

Here is a comprehensive guide to setting up a zero-friction, private AI development workstation using either a pre-compiled plugin or your own custom open-source script.

Prerequisites Checklist

Before diving in, make sure you have the official tools ready, and avoid third-party mirrors so your setup stays clean:

Notepad++ (v8.6.4+): Download the latest stable 64-bit release from the Official Notepad++ Downloads Page.
LM Studio: Download the application installer for your OS directly from the Official LM Studio Homepage.
NppOpenAI Plugin (v0.4.016+): Developed by Krazal. Install it automatically via the built-in Notepad++ Plugins Admin panel, or download the release packages manually from the NppOpenAI GitHub Repository.

Note: If you are working on a corporate-managed machine, ensure compliance with your organization's local software execution policies before installing local server engines.

Step 1: Fire Up the Local Server in LM Studio

LM Studio features a built-in local server that exposes OpenAI-compatible API endpoints. This allows tools designed for the OpenAI cloud API to talk to your local machine instead.

Open LM Studio, search for your preferred model, and download it.
Head to the Local Server tab on the left-side navigation panel (the server rack/network icon).
Select your downloaded model from the dropdown menu at the top and click Start Server.
By default, the server listens on port 1234 of your own machine: http://localhost:1234
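Before wiring up the editor, it can help to confirm the server is actually answering. The following is a minimal sketch, assuming the default address above and LM Studio's OpenAI-style GET /v1/models endpoint; the helper names are illustrative and not part of LM Studio:

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # default from Step 1; adjust if you changed the port


def parse_model_ids(body):
    """Pull model IDs out of an OpenAI-style /models response body (a dict)."""
    return [entry["id"] for entry in body.get("data", [])]


def list_model_ids(base_url=BASE_URL, timeout=5):
    """Ask the local server which models it exposes; raises URLError if it is down."""
    with urllib.request.urlopen(base_url + "/models", timeout=timeout) as response:
        return parse_model_ids(json.loads(response.read().decode("utf-8")))
```

Calling list_model_ids() from a regular Python 3 prompt should return the identifiers the server currently exposes, which are the values to use wherever a model name is requested.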

[Image: Lawrenz, 2026.]

Step 2: Choose Your Destiny (The Plugin vs. The DIY Script)

There are two ways to connect Notepad++ to your offline AI engine. Choose the one that best fits your needs and comfort level.

Option A: Ready-to-Go Plugin (NppOpenAI)

If you want a native, compiled C++ plugin that handles configuration via a simple .ini file, this is the easiest route.

Open Notepad++ and go to Plugins > Plugins Admin...
Search for NppOpenAI, check the box, and click Install.
If the plugin is not listed in Plugins Admin, or you are setting up manually on a strictly offline machine, download the release package and drop NppOpenAI.dll and its supporting network files, such as libcurl.dll and cacert.pem, directly into C:\Program Files\Notepad++\plugins\NppOpenAI\.
Go to Plugins > NppOpenAI > Edit Config and adjust your [API] settings to point to your local machine:

[Image: Lawrenz, 2026.]

Ini, TOML

[API]
secret_key=lm-studio
api_url=http://localhost:1234/v1/
route_chat_completions=chat/completions
response_type=openai
model=local-model
temperature=0.6
; Set show_reasoning to 1 to see the model's <think> tags
show_reasoning=0

[PLUGIN]
; keep_question: 0 replaces selected text; 1 appends the AI answer below it
keep_question=0

; Replace 'local-model' with your loaded model ID if necessary.
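To sanity-check the two URL settings, here is a small sketch that reads the same keys and joins them. It assumes the plugin builds its request URL by combining api_url and route_chat_completions, which is inferred from the key names rather than confirmed from the plugin source; SAMPLE_INI and chat_endpoint are illustrative names:

```python
import configparser

# Hypothetical copy of the [API] keys shown above
SAMPLE_INI = """
[API]
api_url=http://localhost:1234/v1/
route_chat_completions=chat/completions
model=local-model
"""


def chat_endpoint(ini_text):
    """Join api_url and route_chat_completions into one request URL."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    api = parser["API"]
    # Normalize slashes so a missing or doubled "/" does not break the URL
    return api["api_url"].rstrip("/") + "/" + api["route_chat_completions"].lstrip("/")


print(chat_endpoint(SAMPLE_INI))
```

The printed address should match the endpoint LM Studio reports in its server panel. If it does not, the trailing slash on api_url is the first thing to check.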

Option B: Transparent DIY Script (Python)

If you are hesitant about running pre-compiled binary files from GitHub, or if you want absolute control over how your text is formatted, you can write your own connection script in Python.

Go to Plugins > Plugins Admin..., search for the Python Script plugin, and click Install.
Once Notepad++ restarts, go to Plugins > Python Script > New Script.
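Name your file LMStudio_Bridge.py and paste the following script. It uses Python 3 syntax (urllib.request and f-strings), so it needs a Python 3 based PythonScript 3.x release; the 2.x releases are built on Python 2.7 and will not run it as written. Check which version you have under Plugins > Python Script > About, and if it is a 2.x build, install a 3.x build from the plugin's GitHub releases instead.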

Python

import json
import urllib.request

from Npp import editor  # PythonScript's handle on the active document

# Grab the text currently highlighted by your cursor
selected_text = editor.getSelText()

if selected_text.strip():
    url = "http://localhost:1234/v1/chat/completions"

    # Define your custom payload and developer system instructions
    payload = {
        "model": "local-model",
        "messages": [
            {"role": "system", "content": "You are a precise coding assistant inside Notepad++. Optimize, refactor, or explain the code provided."},
            {"role": "user", "content": selected_text}
        ],
        "temperature": 0.6
    }

    try:
        # Send the HTTP POST request to your local LM Studio server
        data = json.dumps(payload).encode('utf-8')
        req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'})

        # The timeout keeps Notepad++ from hanging indefinitely if the server stalls
        with urllib.request.urlopen(req, timeout=120) as response:
            result = json.loads(response.read().decode('utf-8'))

        # Extract the text and replace the selection in the editor
        ai_output = result['choices'][0]['message']['content']
        editor.replaceSel(ai_output)

    except Exception as e:
        # Report the failure (for example, the LM Studio server isn't running);
        # the message appears in the Python Script console
        print(f"Error talking to LM Studio: {e}")

To tie your script to a hotkey: Go to Plugins > Python Script > Configuration.... Move LMStudio_Bridge.py into the Menu Items section and restart Notepad++. You can now open Settings > Shortcut Mapper > Plugin commands and bind it to a key combination such as Ctrl + Shift + L.
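Before binding the hotkey, it can be useful to exercise the same request from an ordinary Python 3 interpreter, where errors are easier to read than inside the editor. This is a sketch under the same assumptions as the script (default port, placeholder model name); build_request, extract_reply, and ask are illustrative helper names:

```python
import json
import urllib.request

URL = "http://localhost:1234/v1/chat/completions"  # same endpoint the script uses


def build_request(text, model="local-model"):
    """Build the POST request the bridge script would send for `text`."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "temperature": 0.6,
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        URL, data=data, headers={"Content-Type": "application/json"}
    )


def extract_reply(result):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    return result["choices"][0]["message"]["content"]


def ask(text, timeout=120):
    """Send `text` to the local server and return the reply (needs LM Studio running)."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as response:
        return extract_reply(json.loads(response.read().decode("utf-8")))
```

With the server running, ask("Say hello") should return a short reply. If it raises a connection error here, the hotkey will fail for the same reason.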

Step 3: Revealing the Hidden Chain of Thought

Some reasoning-tuned local models write out their intermediate reasoning, wrapped in <think>...</think> tags, before generating the final answer.

If using the NppOpenAI plugin: Set show_reasoning=1 in your .ini file to have the thinking process written into your document, or show_reasoning=0 to have the tags stripped so that only the final answer comes back to Notepad++.
If using the Python Script: The script above inserts whatever the server returns in the message content field. Whether that includes the <think> block depends on the model and on how LM Studio is configured to return reasoning, so check a sample reply before relying on clean output.
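Should the tags end up in the reply text anyway, they can be removed before the text is written back. Here is a minimal sketch, assuming the reasoning is wrapped in literal <think>...</think> tags as described above; strip_reasoning is an illustrative helper, not part of the Python Script plugin:

```python
import re

# Matches one <think>...</think> block, including any newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_reasoning(text):
    """Remove <think> blocks and surrounding whitespace from a model reply."""
    return THINK_BLOCK.sub("", text).strip()
```

In the bridge script, this would wrap the extracted content just before the call to editor.replaceSel.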

[Image: Lawrenz, 2026.]

Multi-Purpose Practical Use Cases

Whether you are using the native plugin or your custom script, your workflow is live. Highlight any chunk of text and hit your hotkey shortcut. Here is how you can use this powerhouse across different disciplines:

1. Web Development & UI Prototyping

Instead of hunting down documentation or writing repetitive boilerplate, draft your structural intent in a comment, such as an HTML comment describing the page section you need.

Highlight it, press your shortcut, and the generated HTML lands directly in your tab.

2. Offline Log Analysis & Data Sanitization

If an application crashes or a server throws a cryptic stack trace, copy the messy log into an empty Notepad++ document. Add a quick instruction like: "Analyze this stack trace, find the core exception point, and recommend a resolution." Because everything stays on your machine, you never leak sensitive infrastructure data or IP addresses to a public server.
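Even with a local model, you may want to scrub a log before saving or sharing the document it sits in. This is a rough sketch of one sanitization pass that masks IPv4 addresses. The pattern is deliberately simple, so it does not validate octet ranges, ignores IPv6, and will also mask four-part version strings. mask_ips is an illustrative name:

```python
import re

# Simple IPv4 shape: four groups of 1-3 digits separated by dots
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_ips(log_text, placeholder="[IP]"):
    """Replace anything that looks like an IPv4 address with a placeholder."""
    return IPV4.sub(placeholder, log_text)
```

For example, mask_ips("conn from 10.0.0.5:8080 refused") leaves the port intact and returns "conn from [IP]:8080 refused".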

3. Automated Documentation Macro

You can store recurring prompts at the bottom of your NppOpenAI.ini file for fast styling or code review:

Ini, TOML

[Prompt:readme]
Instruction=Generate a professional, fully-detailed GitHub README.md markdown file based on the selected project notes or code structure.
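If you went the DIY route, the same idea can be sketched as a dictionary of named system prompts. Only the readme entry mirrors the example above; the review entry and the helper name build_messages are hypothetical additions for illustration:

```python
# Named system instructions; "review" is a made-up example entry
PROMPTS = {
    "readme": "Generate a professional, fully-detailed GitHub README.md markdown file "
              "based on the selected project notes or code structure.",
    "review": "Review the selected code and list concrete bugs or risks.",
}


def build_messages(prompt_name, selected_text):
    """Pair a named system instruction with the selected text."""
    return [
        {"role": "system", "content": PROMPTS[prompt_name]},
        {"role": "user", "content": selected_text},
    ]
```

The returned list can be dropped into the messages field of the script's payload, with one small script per prompt name bound to its own shortcut.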

Choose the Right Brain For Your System Rig

Because model releases shift constantly, treat the tiers below as rough guidance and scale the model you load into LM Studio to what your local hardware can handle comfortably:

8GB - 16GB RAM / Laptops: Run compact 3B to 7B parameter models (e.g., Llama 3.2 3B, Phi-3 Mini, or small Qwen models). These typically respond within seconds without exhausting your system memory.
16GB - 32GB RAM + Dedicated GPU: Load 7B to 14B coding models (such as Qwen 2.5-Coder 7B/14B or DeepSeek-Coder 6.7B). They punch above their weight class for complex syntax, nested logic, and reasoning tasks.
64GB+ RAM / Heavy VRAM: If your hardware allows it, scale all the way up to 32B or 70B parameter models for larger software architecture work.
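A back-of-the-envelope way to check whether a model fits: the weights alone take roughly the parameter count times the bits per weight, divided by eight. The sketch below covers weights only. Context cache and runtime overhead come on top, and 4-bit quantization is used here purely as an example setting:

```python
def estimate_weight_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in decimal gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Rough weight footprint for the model sizes mentioned above at 4-bit
for size in (7, 14, 32, 70):
    print(f"{size}B at 4-bit: ~{estimate_weight_gb(size, 4):.1f} GB of weights")
```

By this estimate a 7B model at 4-bit needs about 3.5 GB for its weights, which is why it sits comfortably in the first tier, while a 70B model at the same setting needs about 35 GB before any overhead.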
