Embedding inversion is a privacy attack in which the attacker reconstructs the original input text from a stored embedding vector. Modern embedding models preserve more of their input than intuition suggests: published research has demonstrated that significant portions of the original text can be recovered given only the embedding and access to a similar embedding model.
The implication for AI-SPM is that a vector store containing embeddings of sensitive documents is itself a sensitive store, even if the original documents are kept elsewhere. Treating an embedding store as a low-sensitivity index is a recurring audit finding.
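One way to make this concrete is to have a vector index inherit the strictest classification among the documents it was built from. The following is a minimal sketch of that rule, not a prescribed AI-SPM control: the `Sensitivity` levels and the helper names `index_sensitivity` and `audit_index` are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical classification scheme; substitute your organisation's own labels.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def index_sensitivity(source_labels):
    """Return the label a vector index should inherit: the strictest of its sources."""
    labels = list(source_labels)
    if not labels:
        raise ValueError("an index must be built from at least one source document")
    return max(labels)


def audit_index(assigned_label, source_labels):
    """Return a list of findings; empty if the index is classified strictly enough."""
    required = index_sensitivity(source_labels)
    if assigned_label < required:
        return [
            f"index labelled {assigned_label.name} but sources require {required.name}"
        ]
    return []


# Example: an index over mixed documents that was labelled as a low-sensitivity store.
sources = [Sensitivity.INTERNAL, Sensitivity.RESTRICTED, Sensitivity.PUBLIC]
for finding in audit_index(Sensitivity.INTERNAL, sources):
    print(finding)
```

Taking the maximum is deliberately conservative: a single restricted document in the corpus is enough to make the whole index restricted, because its embedding sits in the same store as every other vector.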
Defences include using embedding models with reduced inversion fidelity, adding noise to embeddings during indexing (at the cost of degraded retrieval quality), and protecting the vector store with the same access controls as the source documents.
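The noise trade-off can be illustrated with a small experiment. The sketch below assumes unit-normalised embeddings and isotropic Gaussian noise; the function names, the 256-dimensional random vector standing in for a real embedding, and the `sigma` values are all illustrative assumptions. It shows only how similarity to the original vector falls as noise grows. It does not measure inversion resistance and provides no formal privacy guarantee.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def add_gaussian_noise(vec, sigma, rng):
    """Add per-dimension Gaussian noise to a unit vector, then re-normalise."""
    noisy = [x + rng.gauss(0.0, sigma) for x in normalize(vec)]
    return normalize(noisy)


rng = random.Random(0)  # fixed seed so the run is repeatable
# A random vector stands in for a real document embedding (assumption).
embedding = normalize([rng.gauss(0.0, 1.0) for _ in range(256)])

for sigma in (0.0, 0.01, 0.05, 0.1):
    noisy = add_gaussian_noise(embedding, sigma, rng)
    print(f"sigma={sigma:<5} cosine to original = {cosine(embedding, noisy):.3f}")
```

The lower the cosine similarity between the stored vector and the original, the less faithfully nearest-neighbour retrieval reflects the source text. Any noise level should therefore be chosen by measuring retrieval quality on representative queries, and ideally by testing against an actual inversion attack, rather than by picking a value in the abstract.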