Day 40 of 100 Days Agentic Engineer Challenge: Long-Term Memory — Hybrid Approach

Damian Dąbrowski
4 min read · Feb 13, 2025


I am still looking for the last missing pieces to finish some of my projects, add agent functionality to them, and build a standalone agent. I like the Mem0 approach: there is a cloud option to test and launch, and since it’s open source under the Apache License 2.0, it’s possible to host it in your own environment. My main issue is that I need to save all data together with a timestamp, and the agent should also be able to list all details related to 100 or more days. A typical RAG solution won’t work here because vector databases are based on similarity search and return roughly 4 to 20 results, so it’s not possible to query hundreds of tasks from a particular time interval or category. I think the best solution would be a hybrid approach. But before I go into details, let me review my daily task routine.

Daily Tasks Routine

  1. Physical Activity — I did 50 squats.
  2. Seven hours of sleep — I slept for 7 hours, but I woke up around 8–9 o’clock. That’s too late; I need to go to bed earlier.
  3. AI Agent — I’m still testing different long-term memory solutions for my agent.
  4. PAIC — In queue.
  5. Data Science — In queue.

If you want to know what all these tasks are about, read the introduction to the 100 Days Agentic Engineer Challenge.

Long-term Memory — Hybrid Structured Approach

To implement long-term memory in an AI chat system with date/time tracking and real-time retrieval, follow this structured approach:

1. Storage Architecture

Use a hybrid database system to balance structured data and semantic search:

Relational Database (e.g., PostgreSQL):

  • Stores raw chat messages with metadata:
CREATE TABLE memories (
    id SERIAL PRIMARY KEY,
    user_id VARCHAR(255),
    message TEXT,
    created_at TIMESTAMP,
    embedding_id UUID -- Optional link to vector DB
);
  • Efficient for time/user-based queries (e.g., “fetch last week’s messages”).

Vector Database (e.g., Pinecone, Weaviate):

  • Stores embeddings of messages with metadata (user ID, timestamp).
  • Enables semantic similarity searches (e.g., “find conversations about vacations”).

2. Storing Memories

  • Step 1: Save each user message to the relational database with a timestamp.
  • Step 2: Generate an embedding (e.g., using OpenAI’s text-embedding-ada-002) for the message.
  • Step 3: Store the embedding in the vector database with metadata:
# Example using Pinecone
vector_db.upsert(
    vectors=[
        (vector_id, embedding, {"user_id": "123", "timestamp": "2024-05-20"})
    ]
)
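The three steps can be sketched end-to-end in Python. This is a minimal sketch, not a definitive implementation: `conn` (a psycopg2 connection), `index` (a Pinecone index), and `embed` (any text-to-vector function, e.g. OpenAI or Sentence Transformers) are assumed to be configured elsewhere, and `build_metadata`/`store_memory` are hypothetical helper names:

```python
import uuid
from datetime import datetime, timezone

def build_metadata(user_id: str, created_at: datetime) -> dict:
    """Metadata stored alongside the embedding so the vector DB can filter on it."""
    return {"user_id": user_id, "timestamp": created_at.date().isoformat()}

def store_memory(conn, index, embed, user_id: str, message: str) -> str:
    """conn: psycopg2 connection; index: Pinecone index; embed: text -> list[float]."""
    created_at = datetime.now(timezone.utc)
    vector_id = str(uuid.uuid4())
    # Step 1: raw message into the relational DB (parameterized to avoid SQL injection)
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO memories (user_id, message, created_at, embedding_id) "
            "VALUES (%s, %s, %s, %s)",
            (user_id, message, created_at, vector_id),
        )
    conn.commit()
    # Step 2: generate the embedding for the message
    embedding = embed(message)
    # Step 3: embedding + metadata into the vector DB, linked by the shared ID
    index.upsert(vectors=[(vector_id, embedding, build_metadata(user_id, created_at))])
    return vector_id
```

The shared `vector_id` is what ties the two stores together: given a vector hit, you can always recover the full row from PostgreSQL.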

3. Retrieval Strategies

Combine semantic relevance and time-based filtering for context-aware responses:

A. Real-Time Context Injection

  • Vector Search:
    Query the vector DB for semantically similar messages, filtered by user_id and recent timestamps:
results = vector_db.query(
    vector=current_message_embedding,
    filter={
        "user_id": "123",
        "timestamp": {"$gte": "2024-05-01"}
    },
    top_k=5
)
  • Relational DB Lookup:
    Fetch recent messages (e.g., last 24 hours) for short-term context:
SELECT * FROM memories 
WHERE user_id = '123'
AND created_at >= NOW() - INTERVAL '1 day'
ORDER BY created_at DESC;
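The two lookups can return overlapping information, so a small merge step helps combine them into one chronologically ordered, deduplicated context. A minimal sketch, assuming both result sets have been normalized into `(timestamp_iso, message)` tuples (from the Pinecone matches’ metadata and the SQL rows, respectively); `merge_context` is a hypothetical helper:

```python
def merge_context(vector_hits, recent_rows, limit=10):
    """Merge semantic hits and recent messages, newest first, deduplicated.

    Both inputs are lists of (timestamp_iso, message) tuples. ISO-8601
    date strings sort correctly as plain strings, so no parsing is needed.
    """
    seen, merged = set(), []
    for ts, msg in sorted(vector_hits + recent_rows, key=lambda m: m[0], reverse=True):
        if msg not in seen:  # drop messages returned by both lookups
            seen.add(msg)
            merged.append((ts, msg))
    return merged[:limit]
```

The `limit` keeps the injected context small enough to fit comfortably in the model’s prompt.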

B. Time-Specific Queries

  • Fetch memories from a specific date range:
SELECT * FROM memories 
WHERE user_id = '123'
AND created_at BETWEEN '2024-05-01' AND '2024-05-20';
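From application code, the same date-range query should use parameterized values rather than string literals. A self-contained sketch using `sqlite3` so it runs as-is; with PostgreSQL via psycopg2 the `?` placeholders become `%s`:

```python
import sqlite3

def memories_between(conn, user_id, start, end):
    """Fetch a user's memories in [start, end], oldest first."""
    cur = conn.execute(
        "SELECT message, created_at FROM memories "
        "WHERE user_id = ? AND created_at BETWEEN ? AND ? "
        "ORDER BY created_at",
        (user_id, start, end),
    )
    return cur.fetchall()

# Minimal in-memory demo of the query above
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, user_id TEXT, "
    "message TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO memories (user_id, message, created_at) VALUES (?, ?, ?)",
    [("123", "I have a dog named Max.", "2024-05-10"),
     ("123", "I'm planning a trip to Spain.", "2024-05-19"),
     ("123", "Old note", "2024-04-01")],
)
rows = memories_between(conn, "123", "2024-05-01", "2024-05-20")
```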

4. Integration with AI Models

  • Prompt Engineering: Inject retrieved memories into the AI’s prompt:
System: You are a helpful assistant. Use this context:
- [2024-05-10] User: I have a dog named Max.
- [2024-05-19] User: I’m planning a trip to Spain.

User: What should I pack for my trip?
  • Use frameworks like LangChain or LlamaIndex to automate context retrieval and injection.
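Without a framework, the injection step is essentially string formatting. A minimal sketch, assuming memories arrive as `(timestamp_iso, message)` tuples; `inject_memories` is a hypothetical helper that returns the chat-message list most LLM APIs expect:

```python
def inject_memories(memories, question):
    """Format retrieved memories into a system prompt plus the user question.

    `memories` is a list of (timestamp_iso, message) tuples.
    """
    lines = "\n".join(f"- [{ts}] User: {msg}" for ts, msg in memories)
    system = "You are a helpful assistant. Use this context:\n" + lines
    # Shaped as the role/content message list common to chat-completion APIs
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```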

5. Optimization & Challenges

  • Indexing: Add indexes on user_id and created_at in the relational DB.
  • Cost: Use compression (e.g., quantization) for vector embeddings to reduce storage.
  • Privacy: Anonymize data and encrypt sensitive fields.
  • Hybrid Search: Tools like Elasticsearch (with plugins for vectors) can unify text and semantic search.

Tools & Libraries

  • Embeddings: OpenAI, Sentence Transformers.
  • Vector DBs: Pinecone, Weaviate, FAISS (local).
  • Relational DBs: PostgreSQL, MySQL.
  • Orchestration: LangChain, LlamaIndex.

Example Workflow

  1. User Input: “What’s the weather like today?”
  2. Retrieval:
  • Vector DB: Finds past discussions about weather preferences.
  • Relational DB: Fetches location from yesterday’s message: “I’m in Berlin.”
  3. AI Response: “In Berlin, it’s 20°C and sunny. Don’t forget your sunglasses!”

This approach ensures the AI recalls both relevant content and temporal context, enabling personalized, coherent interactions. I will try to integrate it into a simple chat app first and should know how it works in a few days.
