It’s been a long road, but I think we have finally reached our destination. From the first day I tested an LLM and tried to figure out what it could do and where its limits were, I thought it would be pretty useless without long-term memory, at least for the way I use my AI assistant Nova today.
I need Nova to remember what we have done, what problems we have solved and how we solved them. I would go insane if I had to start every session with explaining all basic information. To avoid that, I have tested a variety of methods.
These range from manually writing and maintaining a log on the computer, to uploading it to an FTP server and relying on privacy by obfuscation to hide it in plain sight, to unsuccessful attempts at RAG memory through the HuggingFace API.
I’ve spent the past few weeks really digging in to find the best way to integrate Gemini 3.x Pro with MemPalace. My reasoning has been that I am already paying for access to Gemini 3.x Pro in the gemini.google.com web app, and since Google doesn’t provide an acceptable and private memory, I just had to solve that myself.
Recap
It started back in March when I wrote my guide on upgrading CUDA to 13.x, and realized I could run fairly advanced models locally by utilizing NVFP4. While the first test to implement RAG was less than exciting, MemPalace worked like a charm right from the start. After some tinkering we managed to bridge Gemini 2.5 as well as Gemini 3.1 Flash/Pro to work locally, with access to the local MemPalace, through Gemini CLI. However, Gemini CLI required us to log in using OAuth, something I want to avoid.
And finally, last week we got Gemini 3.5 Pro/Flash, Claude Sonnet 4.6, Claude Opus 4.6 and ChatGPT OSS to all work as agents locally, with access to MemPalace, using the official Antigravity Windows app. We quickly found out that the quota for each model is quite low, depending on your subscription. In addition, Google also plans to charge for their Linux environment once the preview period is over.
Which brings us to today, and the latest experiments!
Setup And Configuration
If you have followed my previous guides, you have most of what you need already. If you haven’t followed my previous guides, this is what you need.
Prerequisites
- Windows Subsystem for Linux (wsl guide)
- MemPalace (installation guide)
- CLIProxyAPI
- Gemini CLI
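Before starting, it can help to confirm that the command-line tools this guide relies on are actually on your PATH. The following is a minimal sketch; the function name `missing_tools` is hypothetical, and the tool names in the usage comment (git, node, npx) are my assumption about what the steps in this guide call.

```shell
#!/usr/bin/env bash
# Print the name of every command from the argument list that is NOT on PATH.
# Prints nothing when all of them are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (tool list is an assumption, adjust to your setup):
#   missing_tools git node npx
```

An empty result means everything was found; any printed name needs to be installed first.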
Warning! Using this setup to run heavy agentic workloads could violate Google’s Terms of Service and potentially get your account banned. I mainly use this setup to give Gemini access to MemPalace, and for some lighter agentic work. Using this setup for any other type of work is at your own risk.
Install Go and Compile the Proxy
Your environment must compile the proxy server directly inside the native Linux file system. Open your WSL2 terminal and run:
# Install the Go programming language
sudo apt update && sudo apt install golang-go -y
# Clone the proxy repository into your native Linux home directory
cd ~
git clone https://github.com/router-for-me/CLIProxyAPI.git
cd CLIProxyAPI
# Compile the executable binary
go build -o cli-proxy-api ./cmd/server
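After the build finishes, a quick check that the binary exists and is executable can save confusion later. This is only a sketch; `check_executable` is a hypothetical helper, and the path in the usage comment assumes the clone location used above.

```shell
#!/usr/bin/env bash
# Report whether the given path is a regular, executable file.
# Returns 0 when it is, 1 otherwise.
check_executable() {
  if [ -x "$1" ] && [ ! -d "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or not executable: $1"
    return 1
  fi
}

# Usage (path assumes the clone in your home directory):
#   check_executable "$HOME/CLIProxyAPI/cli-proxy-api"
```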
Configure the Proxy Routing Engine
Create the local configuration directory and overwrite the default setup file. This configuration opens up the legacy API endpoints and maps the client’s hardcoded whitelist strings directly to your actual web subscription models:
mkdir -p ~/.cli-proxy-api
Write the following block to ~/.cli-proxy-api/config.yaml:
cat << 'EOF' > ~/.cli-proxy-api/config.yaml
host: "127.0.0.1"
port: 8317
auth-dir: "~/.cli-proxy-api"
remote-management:
  allow-remote: false
  secret-key: "YourCustomManagementPasswordKey"
api-keys:
  - "YOUR_LOCAL_SECURE_TOKEN_KEY"
# Ensure internal client lines are open
enable-gemini-cli-endpoint: true
# Map model aliases across all channels (including silencing the 2.5 lite spam)
oauth-model-alias:
  aistudio:
    - name: "gemini-3.5-flash-low"
      alias: "gemini-3-flash-preview"
    - name: "gemini-3.5-flash-low"
      alias: "gemini-2.5-flash-lite"
    - name: "gemini-3-pro-high"
      alias: "gemini-3.1-pro-preview"
    - name: "gemini-3-pro-high"
      alias: "gemini-3.1-pro-preview-customtools"
  gemini-cli:
    - name: "gemini-3.5-flash-low"
      alias: "gemini-3-flash-preview"
    - name: "gemini-3.5-flash-low"
      alias: "gemini-2.5-flash-lite"
    - name: "gemini-3-pro-high"
      alias: "gemini-3.1-pro-preview"
    - name: "gemini-3-pro-high"
      alias: "gemini-3.1-pro-preview-customtools"
  antigravity:
    - name: "gemini-3.5-flash-low"
      alias: "gemini-3-flash-preview"
    - name: "gemini-3.5-flash-low"
      alias: "gemini-2.5-flash-lite"
    - name: "gemini-3-pro-high"
      alias: "gemini-3.1-pro-preview"
    - name: "gemini-3-pro-high"
      alias: "gemini-3.1-pro-preview-customtools"
EOF
The same Gemini models are listed repeatedly because I had issues getting the system to select the correct models. The API key is only a local dummy key, so you can leave it as YOUR_LOCAL_SECURE_TOKEN_KEY or change it to anything, as long as you use the same value for GEMINI_API_KEY in the environment variables below.
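If you would rather not keep the placeholder string as your local key, one option is to generate a random value. The sketch below is just one way to do that; `gen_local_token` is a hypothetical helper name, and whatever value you pick has to be the same in the api-keys list and in GEMINI_API_KEY.

```shell
#!/usr/bin/env bash
# Print a random 32-character lowercase hex string.
# Reads 16 bytes from /dev/urandom and formats them as hex with od.
gen_local_token() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage:
#   token="$(gen_local_token)"
#   echo "$token"
```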
Link Your Active Web Subscription
Trigger the authentication wizard to capture your active browser login token and securely lock down the generated credential files:
# Generate the OAuth link
~/CLIProxyAPI/cli-proxy-api --antigravity-login
- Copy the printed URL, open it inside your host Windows browser where your paid consumer Google account is logged in, and authorize it.
- Once the terminal confirms a successful callback, run this command to restrict file permissions on the token store:
chmod 0600 ~/.cli-proxy-api/antigravity-*.json
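To confirm that the permission change took effect, you could inspect the files afterwards. This is a small sketch with hypothetical helper names (`perm_string`, `is_private`); it simply reads the permission column from ls.

```shell
#!/usr/bin/env bash
# Print the 10-character permission string of a file,
# for example "-rw-------" for a regular file with mode 0600.
perm_string() {
  ls -ld "$1" | cut -c1-10
}

# Succeed only if the file is readable and writable by its owner and nobody else.
is_private() {
  [ "$(perm_string "$1")" = "-rw-------" ]
}

# Usage:
#   for f in ~/.cli-proxy-api/antigravity-*.json; do
#     is_private "$f" || echo "permissions too open: $f"
#   done
```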
Align Environment Variables
To prevent malformed double-prefix paths (/v1/v1beta/), the base environment URL must point cleanly to the root proxy server without a trailing /v1 suffix. Append the execution rules to your profile:
cat << 'EOF' >> ~/.bashrc
# CLIProxyAPI Gateway Redirects
export CODE_ASSIST_ENDPOINT="http://127.0.0.1:8317"
export GOOGLE_GEMINI_BASE_URL="http://127.0.0.1:8317"
export GEMINI_API_KEY="YOUR_LOCAL_SECURE_TOKEN_KEY"
EOF
source ~/.bashrc
If you skip this step, you will get errors and be unable to proceed. In my own setup the variable originally looked like this:
GOOGLE_GEMINI_BASE_URL="http://127.0.0.1:8317/v1"
If your value has anything after the port, remove everything following :8317 but keep the closing quotation mark.
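The rule about the base URL can be expressed as a tiny helper that strips a trailing slash and a trailing /v1 suffix. This is only an illustration of the rule described here; `normalize_base_url` is a hypothetical name and not part of Gemini CLI or CLIProxyAPI.

```shell
#!/usr/bin/env bash
# Print the given URL without a trailing "/" and without a trailing "/v1".
normalize_base_url() {
  url="${1%/}"       # drop one trailing slash, if present
  url="${url%/v1}"   # drop a trailing /v1 suffix, if present
  printf '%s\n' "$url"
}

# Usage:
#   export GOOGLE_GEMINI_BASE_URL="$(normalize_base_url "$GOOGLE_GEMINI_BASE_URL")"
```

A URL that is already clean passes through unchanged, so running it more than once does no harm.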
Patch Client Initialization Settings
The client tool panics on boot if its default startup model doesn’t match an internal whitelist string. Enforce the correct default model profile right out of the gate by updating your global settings:
# Make sure the settings directory exists before writing to it
mkdir -p ~/.gemini
cat << 'EOF' > ~/.gemini/settings.json
{
  "security": {
    "auth": {
      "selectedType": "gemini-api-key"
    },
    "enablePermanentToolApproval": true,
    "autoAddToPolicyByDefault": true
  },
  "model": {
    "name": "gemini-3-flash-preview"
  },
  "context": {
    "loadMemoryFromIncludeDirectories": true
  },
  "tools": {
    "sandboxNetworkAccess": true
  },
  "experimental": {
    "voiceMode": true,
    "modelSteering": true,
    "directWebFetch": true
  }
}
EOF
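Note that the command above overwrites any existing settings.json. If you already have settings you care about, you may want to keep a copy before running it. A minimal sketch, with `backup_if_present` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Copy FILE to FILE.bak if FILE exists; do nothing (and print nothing) otherwise.
backup_if_present() {
  if [ -f "$1" ]; then
    cp -p "$1" "$1.bak"
    echo "backed up to $1.bak"
  fi
}

# Usage (run this before overwriting the settings file):
#   backup_if_present ~/.gemini/settings.json
```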
Map MemPalace and Launch
Now, link your local memory drawers to the client profile via the Model Context Protocol:
npx @google/gemini-cli mcp add mempalace /home/USER/MemPalace-Server/venv/bin/mempalace-mcp
If you installed MemPalace in another directory, use that path instead. And as always, replace USER with your actual WSL username.
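Instead of typing the username by hand, the path can be derived from $HOME. This sketch assumes MemPalace lives under MemPalace-Server in your home directory, as in the command above; `mempalace_mcp_path` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the expected path of the MemPalace MCP entry point under the
# current user's home directory (directory layout is an assumption).
mempalace_mcp_path() {
  printf '%s/MemPalace-Server/venv/bin/mempalace-mcp\n' "$HOME"
}

# Usage:
#   npx @google/gemini-cli mcp add mempalace "$(mempalace_mcp_path)"
```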
To run your workspace daily, make sure your proxy is running in one window, then launch the client in another:
# Window 1: Start the Proxy Service
~/CLIProxyAPI/cli-proxy-api --config ~/.cli-proxy-api/config.yaml
# Window 2: Open the Client Console
npx @google/gemini-cli
When prompted on startup, select 2. Use Gemini API Key. The terminal interface will load instantly, routing your local development workflows through your existing subscription at no extra cost.
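Since the client depends on the proxy already running, it can be handy to wait for the proxy port before launching. The following is a rough sketch that relies on bash's /dev/tcp pseudo-device (so it needs bash, not plain sh); `port_open` and `wait_for_port` are hypothetical helper names, and port 8317 comes from the config above.

```shell
#!/usr/bin/env bash
# Succeed if something accepts TCP connections on HOST PORT.
# The connection is opened in a subshell and closed when the subshell exits.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Poll HOST PORT once per second, up to TRIES times (default 10).
# Returns 0 as soon as the port accepts a connection, 1 if it never does.
wait_for_port() {
  host="$1"; port="$2"; tries="${3:-10}"
  while [ "$tries" -gt 0 ]; do
    port_open "$host" "$port" && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_for_port 127.0.0.1 8317 && npx @google/gemini-cli
```

This only checks that the port is reachable, not that the proxy is authenticated or healthy.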
Disclaimer: I don’t know whether this will only work temporarily, or for how long. It works for me right now, but Google might decide to close this door at any time. Or they might not. I also don’t know to what extent this will work, and I won’t risk having my account banned to find out. So be careful: don’t spin up a lot of agents from the start and have them do heavy work.
If you like this type of content, make sure to not miss an update by signing up for my newsletter.
All the AI-related work I do, I do in my spare time, and I share most of it with the world completely free of charge. It takes up a lot of my time, and running this website costs money. I’m grateful for every bit of support I can get from users like you.
