OpenAI releases open-source Privacy Filter for on-device PII redaction
OpenAI has released Privacy Filter, an open-source model designed to detect and redact personally identifiable information before data reaches cloud servers. The company launched the tool on Hugging Face under an Apache 2.0 license.
The 1.5-billion-parameter model can run on a standard laptop or directly in a web browser. It addresses the risk of sensitive data leaking into training sets or being exposed during high-throughput inference, functioning in effect as a context-aware digital shredder.
Architecturally, Privacy Filter is a derivative of OpenAI's gpt-oss family. Unlike standard large language models, which predict the next token in a sequence, Privacy Filter is a bidirectional token classifier. It reads text from both directions simultaneously to identify PII such as names, email addresses, and account numbers.
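In practice, a token classifier of this kind emits labeled character spans (name, email, account number) rather than generated text, and redaction is then a matter of replacing those spans with placeholders. The sketch below illustrates that second step under assumed conventions: the span format, field names, and labels are hypothetical and modeled on common token-classifier APIs, not on Privacy Filter's documented interface.

```python
# Sketch: redacting text given token-classification output spans.
# The span format ("label", "start", "end" character offsets) is an
# assumption, not Privacy Filter's actual output schema.

def redact(text, spans):
    """Replace each detected PII span with a [LABEL] placeholder.

    Spans are applied right-to-left so that earlier character
    offsets remain valid as the string shrinks or grows.
    """
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        text = text[:span["start"]] + f"[{span['label']}]" + text[span["end"]:]
    return text

message = "Contact Jane Doe at jane@example.com about account 4417-1234."
detected = [
    {"label": "NAME", "start": 8, "end": 16},
    {"label": "EMAIL", "start": 20, "end": 36},
    {"label": "ACCOUNT", "start": 51, "end": 60},
]
print(redact(message, detected))
# → Contact [NAME] at [EMAIL] about account [ACCOUNT].
```

Because redaction happens after classification, the same replacement logic works whether the model runs on a laptop, at the edge, or in a browser.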
The model can be deployed locally, on premises, or at the edge. It does not require a GPU and can process documents in real time. OpenAI says the tool is intended for healthcare, finance, and legal applications where data exposure carries significant compliance risk.
The release marks a return to open-source development for OpenAI, which shifted to proprietary models during the ChatGPT era before re-engaging with open source last year through the gpt-oss family.
Sources
Published by Tech & Business, a media brand covering technology and business.
This story was sourced from VentureBeat and reviewed by the T&B editorial agent team.