Browser-only. File never uploaded

Anonymize a PDF or DOCX before pasting it into ChatGPT

Drop a PDF or DOCX and the text is extracted in your browser. Structured PII (emails, phones, Spanish DNIs/NIEs/NIFs, IBANs, credit cards) is detected and replaced using the method you pick. The output is a .txt that is safer to paste into ChatGPT.

Drop a document to anonymize

PDF, DOCX or TXT. Text is extracted in your browser; the file is never uploaded.

Supported: .pdf, .docx, .doc, .txt

Why a separate flow for documents

Sharing a PDF report or a Word doc with ChatGPT or Claude is the most common AI leak scenario for non-technical users. Customer contracts, support emails forwarded as .eml, and expense reports all carry direct identifiers. This flow extracts the text locally and runs the same regex-based PII detector used by the CSV flow over it. Browser-only, free up to 2,000 lines.

  • PDFs and DOCX files never reach our servers: pdf.js and mammoth run client-side
  • Detects emails, phones, Spanish DNI/NIE/NIF, IBAN and credit cards reliably
  • Pseudonymize keeps the mapping so you can reverse the replacements later with the reverse tool
  • Honest about limits: names and addresses need NER and may be missed. You review before download
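As a rough illustration of what regex-based detection of structured identifiers looks like, here is a minimal sketch. It is written in Python for readability only; the tool itself runs JavaScript in the browser. The patterns and the `find_pii` name are assumptions made for this example, not the tool's actual expressions, and only emails and Spanish-style phone numbers are covered.

```python
import re

# Hypothetical patterns for illustration; not the tool's real expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Spanish numbers: optional +34 prefix, then 9 digits starting with 6, 7, 8 or 9.
    "PHONE": re.compile(r"(?<!\d)(?:\+34 ?)?[6789]\d{2} ?\d{3} ?\d{3}(?!\d)"),
}


def find_pii(text):
    """Return (kind, value) pairs in the order they appear in the text."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), kind, match.group(0)))
    hits.sort()  # order by position in the text
    return [(kind, value) for _, kind, value in hits]


print(find_pii("Contact ana@example.com or call +34 612 345 678."))
# [('EMAIL', 'ana@example.com'), ('PHONE', '+34 612 345 678')]
```

A name such as 'María García' produces no hit with patterns like these, which is the limit described in the last bullet.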

How it works

  1. Drop the document
    PDF, DOCX or TXT. The browser extracts the text directly; the binary file is never uploaded.
  2. Pick a method per PII type
    For each kind detected, choose redact / hash / faker / pseudonymize. By default, the reversible types are pseudonymized and credit cards are redacted.
  3. Review and download
    Preview the anonymized text in the browser, then download the .txt and the mapping file (JSON or CSV).
  4. Paste into ChatGPT and reverse
    Paste the .txt into ChatGPT, run your analysis, then use the reverse tool with the same mapping to translate the AI answer back.
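The steps above can be sketched as a small function that applies one method to every detected email and returns the mapping alongside the text. This is a Python illustration under stated assumptions, not the tool's code: the `anonymize` name, the `[REDACTED]` placeholder, and the 12-character truncated SHA-256 are all hypothetical choices. Only the `EMAIL_0001` token shape is taken from this page, and the faker method is left out because it needs a third-party library.

```python
import hashlib
import json
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize(text, method="pseudonymize"):
    """Replace every email in text; return (new_text, mapping of token -> original)."""
    mapping = {}  # token -> original value, only filled by pseudonymize
    seen = {}     # original value -> token, so repeats reuse the same token

    def replace(match):
        value = match.group(0)
        if method == "redact":
            return "[REDACTED]"  # hypothetical placeholder, not reversible
        if method == "hash":
            # One-way: same input gives same output, but no mapping is kept.
            return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
        if value not in seen:
            token = f"EMAIL_{len(seen) + 1:04d}"
            seen[value] = token
            mapping[token] = value
        return seen[value]

    return EMAIL_RE.sub(replace, text), mapping


text = "From ana@example.com to bob@example.org, cc ana@example.com"
safe_text, mapping = anonymize(text)
print(safe_text)            # From EMAIL_0001 to EMAIL_0002, cc EMAIL_0001
print(json.dumps(mapping))  # the kind of content a JSON mapping file could hold
```

Reusing one token per distinct value keeps the anonymized text readable for the model: it can still tell that the sender and the cc address are the same person.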

What this can and cannot do

The detector is regex-driven with format validators (Spanish DNI letter check, IBAN mod-97, credit card Luhn). This keeps false positives very low: when it flags an IBAN, the string has passed the IBAN checksum, so it is almost certainly an IBAN. The trade-off is that only structured identifiers are detected.
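To make the validator idea concrete, here is a minimal Python sketch of two of the checks named above: the Luhn check for card numbers and the mod-97 check for IBANs. The function names and the length limits are assumptions for this example; the tool's own implementation runs as JavaScript in the browser and may differ in detail.

```python
def luhn_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 12:  # assumed lower bound for a card-like number
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """IBAN check: move the first 4 chars to the end, map A-Z to 10-35, test mod 97."""
    s = iban.replace(" ", "").upper()
    if not (15 <= len(s) <= 34) or not s.isascii() or not s.isalnum():
        return False
    rearranged = s[4:] + s[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


print(luhn_valid("4111 1111 1111 1111"))            # True
print(iban_valid("ES91 2100 0418 4502 0005 1332"))  # True
print(iban_valid("ES91 2100 0418 4502 0005 1333"))  # False
```

A random 24-character string that merely looks like an IBAN passes the mod-97 step only about once in 97 tries, which is why a regex match plus a checksum gives so few false positives.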

Names like 'María García' and addresses like 'Calle Mayor 12' are NOT detected without a named-entity recognition model. We surface this as a warning before download. You review the preview, then export and fix any residual names or addresses in the downloaded .txt. For most ChatGPT use cases this catches the high-stakes leaks (emails, phones, IDs), and you can sanity-check the rest in seconds.

Reverse the AI answer locally

When ChatGPT returns a response referencing EMAIL_0001 or DNI_0003, paste it into the reverse tool with the mapping file. Token-by-token replacement, all in your browser.
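A minimal Python sketch of that reversal, assuming tokens always look like an uppercase kind, an underscore and four digits (the shape of EMAIL_0001 and DNI_0003), and that the mapping file has been loaded into a dict of token to original value. The `reverse_tokens` name and the sample values are hypothetical.

```python
import re

# Assumed token shape: uppercase kind, underscore, four digits.
TOKEN_RE = re.compile(r"\b[A-Z]+_\d{4}\b")


def reverse_tokens(text, mapping):
    """Swap each known token back to its original; leave unknown tokens untouched."""
    return TOKEN_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


mapping = {"EMAIL_0001": "ana@example.com", "DNI_0003": "12345678Z"}
answer = "Send the invoice to EMAIL_0001 (ID DNI_0003). EMAIL_0009 is unknown."
print(reverse_tokens(answer, mapping))
# Send the invoice to ana@example.com (ID 12345678Z). EMAIL_0009 is unknown.
```

Matching whole tokens in a single pass, rather than looping over the mapping with plain string replacement, avoids accidental replacements inside longer strings and never re-scans text that has already been restored.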

Open the reverse tool →

Frequently asked questions

Does my PDF leave the browser?

No. pdf.js and mammoth run as JavaScript inside your browser. The binary file never reaches our servers. Only metadata about the operation (line count, byte size) is sent for the paid-tier gate.

What gets detected and what does not?

Detected reliably: emails, phone numbers, Spanish DNI/NIE/NIF (with a valid check letter), IBANs (with a valid mod-97 checksum), credit cards (passing the Luhn check). Not detected: personal names, company names, addresses, free-form descriptors. Those need NER and are out of scope without a model.
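For readers curious about the check letter, here is a minimal Python sketch of the standard DNI/NIE rule: the number modulo 23 indexes into a fixed 23-letter table, and an NIE first swaps its leading X, Y or Z for 0, 1 or 2. The `dni_nie_valid` name is an assumption for this example, and company tax IDs, which use a different scheme, are not covered.

```python
import re

DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"      # letter for remainder 0..22
NIE_PREFIX = {"X": "0", "Y": "1", "Z": "2"}  # NIE leading letter -> digit


def dni_nie_valid(value: str) -> bool:
    """Return True if value is a DNI or NIE whose check letter matches."""
    s = value.strip().upper().replace("-", "")
    if re.fullmatch(r"[0-9]{8}[A-Z]", s):
        number = int(s[:8])
    elif re.fullmatch(r"[XYZ][0-9]{7}[A-Z]", s):
        number = int(NIE_PREFIX[s[0]] + s[1:8])
    else:
        return False
    return DNI_LETTERS[number % 23] == s[-1]


print(dni_nie_valid("12345678Z"))  # True
print(dni_nie_valid("12345678A"))  # False
print(dni_nie_valid("X1234567L"))  # True
```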

Can I edit the text before downloading?

The preview is read-only in this first version. You review it and decide whether to download. A future version will let you edit residual names and addresses inline.

How is it priced?

Same tier table as the CSV flow: free up to 2,000 lines, then $3/$7/$15/$29 by line count. Reverse uses the same tier table.

Can I anonymize a contract with hundreds of names?

You can run it, but only the structured identifiers will be replaced. For contracts where names are the main concern, this tool catches IDs and IBANs while you handle the names manually with find-and-replace. A future NER-enabled mode is on the roadmap.
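If you prefer to script the manual pass instead of using an editor's find-and-replace, a sketch like the following could work on the downloaded .txt. It is not part of the tool: the `replace_names` function and the `NAME_0001` token format are hypothetical, chosen here to resemble the tokens the tool produces, and you supply the list of names yourself.

```python
import re


def replace_names(text, names):
    """Replace each listed name with a NAME_nnnn token; return (text, mapping)."""
    mapping = {}
    # Longest names first, so 'María García López' wins over 'María García'.
    ordered = sorted(set(names), key=lambda n: (-len(n), n))
    for i, name in enumerate(ordered, start=1):
        token = f"NAME_{i:04d}"
        # Match the name only when it is not glued to other word characters.
        pattern = re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)")
        text, count = pattern.subn(token, text)
        if count:
            mapping[token] = name
    return text, mapping


contract = "María García signs for Acme SL. María García López witnesses."
safe, names_map = replace_names(contract, ["María García", "María García López"])
print(safe)  # NAME_0002 signs for Acme SL. NAME_0001 witnesses.
```

Keep the returned mapping next to the tool's own mapping file so the names can be restored by hand later.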