Retrieval-Augmented Generation (RAG) has become the de facto standard for enterprise AI. But in banking, “enterprise” means a complex web of permissions, Chinese walls, and need-to-know access controls. A RAG system that does not respect these boundaries cannot be deployed.
The Security Challenge in Vector Search
Vector databases excel at finding semantically similar text. Enforcing access control lists (ACLs) is not what they were built for. A standard vector query looks like this: “Find the 5 chunks most similar to ‘CEO Salary’.” The database searches the entire index and returns the top 5 matches, regardless of who is asking.
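If one of those matches is a confidential HR document, you have a data leak. Post-filtering (retrieving 100 documents, checking permissions, and discarding the unauthorized ones) is inefficient and risky. It adds latency, it can leave fewer results than the user asked for once unauthorized chunks are removed, and it pulls confidential content into the application layer before any permission check has run.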
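To make the shortfall concrete, here is a minimal sketch using pre-scored toy hits. The scores, chunk IDs, token names, and the `post_filter_top_k` helper are all made up for illustration; no real vector store is involved.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    chunk_id: str
    score: float              # similarity score, higher means more similar
    access_tokens: frozenset  # tokens allowed to read this chunk


def post_filter_top_k(hits, user_tokens, k):
    """Take the global top-k first, then drop what the user may not see."""
    top_k = sorted(hits, key=lambda h: h.score, reverse=True)[:k]
    return [h for h in top_k if h.access_tokens & user_tokens]


hits = [
    Hit("hr-1", 0.95, frozenset({"group:hr_admins"})),
    Hit("hr-2", 0.93, frozenset({"group:hr_admins"})),
    Hit("pub-1", 0.80, frozenset({"group:all_staff"})),
    Hit("pub-2", 0.78, frozenset({"group:all_staff"})),
    Hit("pub-3", 0.75, frozenset({"group:all_staff"})),
]
user_tokens = frozenset({"group:all_staff"})

visible = post_filter_top_k(hits, user_tokens, k=3)
print([h.chunk_id for h in visible])  # ['pub-1']
```

The user asked for three chunks and received one, even though three permitted chunks exist. The two confidential chunks also passed through the application before being discarded.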
The Solution: Strict Pre-Filtering
We advocate for a Pre-Filtering approach where the search space is narrowed before the vector comparison happens. This requires a robust metadata strategy during ingestion.
1. Ingestion Pipeline Strategy
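When a document is ingested from SharePoint or a document management system (DMS), we must capture its ACLs along with its content. We map these to a flattened list of “Access Tokens” that is stored as metadata on every chunk of that document.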
{
  "doc_id": "hr_policy_2025.pdf",
  "chunk_id": "102",
  "content": "The CEO's variable pay is capped at...",
  "metadata": {
    "department": ["HR", "Board"],
    "level": ["L6", "L7", "CXO"],
    "access_tokens": ["group:hr_admins", "user:ceo@bank.com"]
  }
}
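The flattening step might look like the sketch below. The shape of `source_acl` (a dict with `groups` and `users` keys) and the `flatten_acl` helper are assumptions for illustration; the permission payload returned by SharePoint or a given DMS will look different and needs its own adapter.

```python
def flatten_acl(acl: dict) -> list[str]:
    """Turn a source ACL into a sorted, de-duplicated list of access tokens.

    Tokens are lowercased and prefixed with their principal type so that
    ingestion and query time agree on a single format.
    """
    tokens = {f"group:{g.lower()}" for g in acl.get("groups", [])}
    tokens |= {f"user:{u.lower()}" for u in acl.get("users", [])}
    return sorted(tokens)


# Hypothetical ACL as it might arrive from the source system.
source_acl = {"groups": ["HR_Admins"], "users": ["CEO@bank.com"]}

chunk = {
    "doc_id": "hr_policy_2025.pdf",
    "chunk_id": "102",
    "metadata": {"access_tokens": flatten_acl(source_acl)},
}
print(chunk["metadata"]["access_tokens"])
# ['group:hr_admins', 'user:ceo@bank.com']
```

A document with no ACL entries yields an empty token list in this sketch, which means no user can match it. Failing closed like this is usually the safer default.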
2. Query-Time Identity Resolution
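When a user logs in, we fetch their group memberships from Azure AD or LDAP and convert them to the same token format used at ingestion (for example, marketing_team becomes group:marketing_team), so that the two sides can be compared directly.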
User: john.doe@bank.com
Groups: ["marketing_team", "mumbai_branch", "level_l4"]
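A minimal sketch of this resolution step follows. The `FAKE_DIRECTORY` dict stands in for the real Azure AD or LDAP lookup, and both helper names are hypothetical; the only point is that the user's groups end up in the same prefixed, lowercased format as the tokens stored on each chunk.

```python
# Stand-in for a directory lookup; a real system would call Azure AD or LDAP.
FAKE_DIRECTORY = {
    "john.doe@bank.com": ["marketing_team", "mumbai_branch", "level_l4"],
}


def fetch_groups(email: str) -> list[str]:
    """Return the user's groups, or an empty list for unknown users."""
    return FAKE_DIRECTORY.get(email.lower(), [])


def resolve_user_tokens(email: str) -> frozenset:
    """Build the token set used to filter the vector search for this user."""
    tokens = {f"user:{email.lower()}"}
    tokens.update(f"group:{g.lower()}" for g in fetch_groups(email))
    return frozenset(tokens)


john_tokens = resolve_user_tokens("John.Doe@bank.com")
print(sorted(john_tokens))
```

An unknown user resolves to a single `user:` token and no groups, so they can only match chunks that were explicitly shared with them.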
3. The Filtered Vector Search
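The actual query sent to the vector store (e.g., Pinecone, Milvus, or Weaviate) includes a filter clause. We effectively say: “Find chunks similar to ‘Salary’ WHERE (access_tokens INTERSECT user_groups IS NOT EMPTY).” In other words, a chunk is a candidate only if it carries at least one of the user's tokens, so unauthorized chunks never enter the result set.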
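The semantics of that filter can be sketched with a tiny in-memory index. This is not the API of Pinecone, Milvus, or Weaviate, each of which has its own filter syntax; the `filtered_search` function, the two-dimensional vectors, and the sample chunks are assumptions chosen to keep the example small.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(index, query_vec, user_tokens, k):
    """Narrow the search space by permission first, then rank by similarity."""
    allowed = [c for c in index if user_tokens & set(c["access_tokens"])]
    ranked = sorted(allowed, key=lambda c: cosine(query_vec, c["vector"]),
                    reverse=True)
    return ranked[:k]


index = [
    {"chunk_id": "102", "vector": [0.9, 0.1],
     "access_tokens": ["group:hr_admins", "user:ceo@bank.com"]},
    {"chunk_id": "201", "vector": [0.7, 0.3],
     "access_tokens": ["group:marketing_team"]},
    {"chunk_id": "202", "vector": [0.1, 0.9],
     "access_tokens": ["group:marketing_team", "group:hr_admins"]},
]
user_tokens = {"user:john.doe@bank.com", "group:marketing_team"}

results = filtered_search(index, [1.0, 0.0], user_tokens, k=2)
print([c["chunk_id"] for c in results])  # ['201', '202']
```

Chunk 102 is the closest match to the query vector, but it is never scored for this user because the permission check removes it before ranking.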
Handling Hierarchies and Inheritance
Banking hierarchies are deep. A Regional Head should see everything a Zonal Head sees, and a Zonal Head should see everything a Branch Manager sees. To handle this without exploding the metadata, we implement Group Expansion at query time.
If a user is a “Regional Head”, our auth middleware expands this to include [“Zonal Head”, “Branch Manager”, “Sales Officer”]. Senior roles then inherit lower-level access automatically: each document needs only the token of the most junior role permitted to read it, rather than an explicit tag for every role above it.
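Group Expansion can be sketched as a walk down a role hierarchy. The `SUBORDINATES` mapping and the role identifiers below are hypothetical; in practice the hierarchy would come from the bank's HR or identity system.

```python
# Hypothetical hierarchy: each role maps to the roles directly beneath it.
SUBORDINATES = {
    "regional_head": ["zonal_head"],
    "zonal_head": ["branch_manager"],
    "branch_manager": ["sales_officer"],
}


def expand_roles(roles):
    """Return the given roles plus every role beneath them in the hierarchy."""
    expanded = set()
    stack = list(roles)
    while stack:
        role = stack.pop()
        if role in expanded:
            continue  # already visited; also guards against cycles
        expanded.add(role)
        stack.extend(SUBORDINATES.get(role, []))
    return expanded


print(sorted(expand_roles(["regional_head"])))
# ['branch_manager', 'regional_head', 'sales_officer', 'zonal_head']
```

Expansion only runs downward, so a Branch Manager gains the Sales Officer role but never the Zonal Head role.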
Audit and Governance
Every retrieved chunk must be logged. The log should state: “User X queried Y. System retrieved chunks A, B, C because User X had Permission Z.” This level of traceability is essential for the ISO 27001 and SOC 2 audits that banks undergo regularly.
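One possible shape for such a log entry is sketched below. The field names and the `build_audit_record` helper are assumptions, not a format prescribed by either standard; the idea is to record, per chunk, which of the user's tokens justified the retrieval.

```python
import json
from datetime import datetime, timezone


def build_audit_record(user, query, retrieved, user_tokens):
    """Record who asked what, which chunks came back, and why each was allowed."""
    chunks = []
    for chunk in retrieved:
        chunks.append({
            "chunk_id": chunk["chunk_id"],
            # The tokens the user and the chunk have in common are the
            # "Permission Z" that justified returning this chunk.
            "matched_tokens": sorted(user_tokens & set(chunk["access_tokens"])),
        })
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "query": query,
        "chunks": chunks,
    }


retrieved = [
    {"chunk_id": "201", "access_tokens": ["group:marketing_team"]},
]
user_tokens = {"user:john.doe@bank.com", "group:marketing_team"}

record = build_audit_record("john.doe@bank.com", "campaign budget",
                            retrieved, user_tokens)
print(json.dumps(record, indent=2))
```

A record with an empty `matched_tokens` list would indicate that a chunk was returned without a matching permission, which makes it a useful signal to alert on.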