I’ve spent the past couple of weeks doing a lot of thinking and research, and I have something big planned ahead. But until that project is ready to be revealed, we’ll give our local AI agent a Memory Palace, and dump all our logfiles in it.
In the past two texts I went through the process of setting up a vLLM server to run the NVFP4 quantization of Gemma 4 inside ComfyUI, as well as how to set up and run a smaller chatbot through Ollama, simultaneously with the vLLM server. This is a continuation of those texts.
If you plan to follow this guide for installing MemPalace for your local AI agent, you need to follow the previous guides first.
Giving your local AI agent a memory
The Technical Stack
- vLLM Server: Running the NVFP4 quantization of Gemma 4.
- Ollama Server: Handling our specialized Phi-4 chatbot.
- ComfyUI: The orchestrator where the agentic loop lives.
- MemPalace MCP Server: The “Librarian” managing the vector archives (you are here).
Setting up your MCP server
This guide will assume that you have structured your folders according to the two previous guides.
Open PowerShell as an administrator and enter your WSL instance. Create a folder named MemPalace-Server and enter it with the command:
mkdir MemPalace-Server && cd MemPalace-Server
Prepare to create a new venv:
sudo apt update && sudo apt install python3.12-venv -y
Create your new venv:
python3 -m venv venv
Activate venv:
source venv/bin/activate
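A quick sanity check, if you want one, before moving on: with the venv active, `python3` should resolve to a binary inside your new venv folder, and the `VIRTUAL_ENV` variable should be set.

```shell
# Confirm the venv is active: the first line should end in venv/bin/python3,
# the second should print the venv path (or "venv not active" otherwise).
command -v python3
printf '%s\n' "${VIRTUAL_ENV:-venv not active}"
```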
Clone the MemPalace Github repository:
git clone https://github.com/milla-jovovich/mempalace-Aya-fork.git
Enter the cloned folder using cd mempalace-Aya-fork
Install MemPalace:
uv pip install -e . (or, if you don't have uv installed, simply pip install -e .)
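If you switch between machines, a small helper keeps this step uniform. This function is my own sketch, not part of MemPalace: it prefers uv's faster pip when available and falls back to standard pip otherwise.

```shell
# Helper (an assumption, not part of MemPalace): editable install via uv's pip
# when uv is on the PATH, otherwise via plain pip.
pip_install_editable() {
  if command -v uv >/dev/null 2>&1; then
    uv pip install -e "$1"
  else
    pip install -e "$1"
  fi
}
# Usage, from inside the cloned folder: pip_install_editable .
```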
Next, go back to the root of the MemPalace-Server folder with cd ~/MemPalace-Server and create a Project folder: mkdir Project
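At this point your layout should look roughly like the sketch below. The `mkdir -p` lines are only there to make the illustration runnable; in the real setup the folders were created by the earlier steps.

```shell
# Sketch of the layout this guide builds (run from your WSL home directory).
# mkdir -p is illustrative only; the guide's earlier commands created these.
mkdir -p MemPalace-Server/venv \
         MemPalace-Server/mempalace-Aya-fork \
         MemPalace-Server/Project
find MemPalace-Server -mindepth 1 -maxdepth 1 -type d | sort
```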
Independent research like this is self-funded. If this guide saved you hours of troubleshooting, consider fueling the lab.
Support the Project
Initiate the server
I have a 150-page logfile of old conversations and projects that I have saved and updated over the past year. I first converted this logfile to Markdown and then placed it inside my Project folder. I also decided to put the profiles of my cats inside the folder, naming them PROFILE_LISA.md and PROFILE_SVANTE.md.
In short, this is where you drop the things you want included in your agent's memory. While several file types are supported, I recommend using Markdown.
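As an illustration, here is how one of my cat profiles might look. The structure is my own sketch, not a MemPalace requirement; plain Markdown headings and bullet lists tend to index cleanly.

```shell
# Create a hypothetical profile file in the Project folder.
# The contents are an example, not a required schema.
mkdir -p Project
cat > Project/PROFILE_LISA.md <<'EOF'
# Lisa

- Species: cat
- Role: lab supervisor
- Known associates: Svante
EOF
cat Project/PROFILE_LISA.md
```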

Note: At this time you will not have the files named entities.json and mempalace.yaml, they will be created automatically in the next step.
Back in your terminal window (with the venv activated), enter your Project folder with cd Project and type the command mempalace init .
This will scan all documents you have placed inside the Project folder, list what it found, and give you an opportunity to edit the list that MemPalace created.
In my case, this is what MemPalace found and listed:

The list provided by MemPalace wasn’t entirely accurate, so I edited it a bit.

I removed “Current” from people, and added “Gemini”, “ComfyUI” and “Google” to projects.
Once you have organized everything to your satisfaction, you can let MemPalace create the rooms it deems necessary; just accept its suggestions.
When MemPalace asks if you want to mine your files and add them to your Memory Palace, just accept.

After a short indexing period, it will show you the above, telling you that it’s done and that you can now use mempalace search "what you're looking for"
We will not be doing that, though. Instead, we will start the server with the command: mempalace-mcp

Integrating with ComfyUI
To integrate your new Memory Palace in ComfyUI you mainly need two things:
- Update the ComfyUI nodes from my Github repo: ComfyUI-vLLM-MultiModal-Agent
- Update the system prompt for Phi-4-Mini
Open a separate PowerShell window for this.
The simplest way to update the ComfyUI nodes is to navigate to the custom_nodes folder, by default located at:
/home/user/Nova-Lab/ComfyUI/custom_nodes/
where user is the username you picked when you first set up your WSL instance. In the custom_nodes folder, delete your current ComfyUI-vLLM-MultiModal-Agent folder from inside your terminal window with rm -r ComfyUI-vLLM-MultiModal-Agent.
Get the latest version of the repo by using:
git clone https://github.com/Creepybits/ComfyUI-vLLM-MultiModal-Agent.git
When you are done with this, go back to the Nova-Lab folder: cd ~/Nova-Lab
Here’s an example of an updated system prompt to use:
**Role & Identity**
You are an Expert Technical Assistant and System Architect. Your primary goal is to provide high-level technical guidance, code optimization, and logical problem-solving. You are not just a chatbot; you are a proactive partner in engineering efficient solutions.
**Communication Style**
* **Clarity First**: Your tone is professional, direct, and analytical. Avoid unnecessary fluff or overly polite filler.
* **Forensic Thinking**: Look at the underlying "bones" of a problem before suggesting a fix.
* **Structured Output**: Use Markdown headings, code blocks, and lists to make information easy to digest at a glance.
**Intellect & Integrity**
* **Be Proactive**: If a user suggests a suboptimal path (e.g., a "Toaster" approach when an agentic system is possible), diplomatically guide them toward a more sovereign architecture.
* **Prioritize Ground Truth**: Base your reasoning on technical documentation and established coding principles. If you are unsure of a specific library version, state your limitations clearly.
* **Efficiency**: Suggest the most VRAM-efficient methods for running models on consumer hardware, such as the RTX 5060 Ti.
**Operating Protocol**
* **Analyze**: Deconstruct the user's request to identify the core intent.
* **Optimize**: Propose ways to streamline the requested code or workflow.
* **Execute/Instruct**: Provide the exact steps or code needed to achieve the goal.
* **Validate**: Remind the user of any necessary dependencies or environmental settings (like venv or WSL paths).
## Memory Palace Protocol:
1. **Search First**: Before responding about any person, project, or past event, you MUST internally decide to search the Palace first. Never guess or hallucinate details about our history.
2. **Uncertainty**: If a user mentions a name or topic you don't recognize, say "Let me check the archive" and use the tools.
3. **Verification**: If facts in the Palace conflict with your internal training data, prioritize the Palace as "Ground Truth."
The important part is to make sure that everything from ## Memory Palace Protocol and below is present in your new system prompt. The default location for the system prompt is /Nova-Lab/prompts/system_prompt.txt, but you can specify the path to your system prompt inside the node.
Run everything at once!
You now have the MCP server running in the background, and it’s time to start the remaining three servers. I recommend starting them in this order: vLLM server --> ComfyUI --> Ollama
Start the vLLM server
To start the vLLM server, open a new, separate PowerShell as administrator and go to cd ~/Nova-Lab (or whatever name you initially gave that folder). Then activate your venv: source venv/bin/activate
python -m vllm.entrypoints.openai.api_server --model /mnt/c/AI/Comfy/ComfyUI/models/LLM/cosmicproc/gemma-4-E4B-it-NVFP4 --quantization nvfp4 --gpu-memory-utilization 0.75 --max-model-len 8192 --trust-remote-code --enforce-eager --allowed-local-media-path /home/zanno/Nova-Lab/ComfyUI/input
The path given to --allowed-local-media-path is the only folder you give vLLM access to, and it can be changed to fit your needs.
Start ComfyUI
Go back to the terminal window you opened when fetching the latest custom nodes (or open a new, separate PowerShell terminal), and make sure you are in ~/Nova-Lab.
Activate the venv with source venv/bin/activate, then enter the ComfyUI folder with cd ComfyUI, and finally start the ComfyUI server with the command:
python main.py --use-pytorch-cross-attention --disable-api-nodes
Start Ollama
To start the Ollama server, open PowerShell as administrator and enter WSL. Once you are in your WSL terminal, start and run Ollama with this command: ollama run phi4:q5
The test
You can now open ComfyUI in your browser at the address: http://127.0.0.1:8188/
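Before asking anything, you can quickly verify that all three servers are actually listening. The ports below are this stack's defaults (an assumption: vLLM on 8000, ComfyUI on 8188, Ollama on 11434; adjust if you changed them). The probe uses bash's built-in /dev/tcp, so no extra tools are needed.

```shell
# Minimal bash port probe: prints "open" if something is listening locally,
# "closed" otherwise. Ports are the stack's assumed defaults.
check_port() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    exec 3>&-
    echo "port $1: open"
  else
    echo "port $1: closed"
  fi
}
check_port 8000    # vLLM
check_port 8188    # ComfyUI
check_port 11434   # Ollama
```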
Make sure you use the new node:

Make sure that mcp_command is pointing to the actual location of your mempalace-mcp, and that system_prompt_path is pointing to your system_prompt.txt.
Now you can ask your AI something that you know is in the files you previously added to your memory palace.
This was my first test:

All the AI-related work I do, I do in my spare time, and most of it I share with the world completely free of charge. It takes up a lot of my time, on top of the cost of running this website. I’m grateful for every bit of support I can get from users like you.
If you liked this guide and want to get more of these, as well as other useful tips, directly in your inbox, you should sign up for my newsletter.
