Detect and redact sensitive data in Microsoft Excel spreadsheets. Process XLSX and legacy XLS files with formula preservation, multi-sheet handling, and hidden data removal.
Comprehensive spreadsheet support
Process all worksheets in a workbook, including hidden sheets that may contain sensitive data.
Preserve formulas while redacting cell values. Choose to protect formula references or redact completely.
Detect PII in named ranges and defined names used throughout the workbook.
Scan and redact cell comments that may contain reviewer notes with sensitive information.
Process chart titles, labels, and linked data that might display PII visually.
Find and handle hidden rows, columns, sheets, and workbook metadata.
Simple integration, powerful results
Send your documents, text, or files through our secure API endpoint or web interface.
Our AI analyzes content to identify all sensitive information types with 99.7% accuracy.
Sensitive data is automatically redacted based on your configured compliance rules.
Receive your redacted content with full audit trail and compliance documentation.
Get started with just a few lines of code
Python
import requests

api_key = "your_api_key"
url = "https://api.redactionapi.net/v1/redact"
data = {
    "text": "John Smith's SSN is 123-45-6789",
    "redaction_types": ["ssn", "person_name"],
    "output_format": "redacted"
}

response = requests.post(
    url,
    headers={"Authorization": f"Bearer {api_key}"},
    json=data
)
print(response.json())
# Output: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
JavaScript
const axios = require('axios');

const apiKey = 'your_api_key';
const url = 'https://api.redactionapi.net/v1/redact';
const data = {
  text: "John Smith's SSN is 123-45-6789",
  redaction_types: ["ssn", "person_name"],
  output_format: "redacted"
};

axios.post(url, data, {
  headers: { 'Authorization': `Bearer ${apiKey}` }
})
  .then(response => {
    console.log(response.data);
    // Output: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
  });
cURL
# The apostrophe in the JSON body is written as '\'' so that it does not
# terminate the single-quoted shell string.
curl -X POST https://api.redactionapi.net/v1/redact \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "John Smith'\''s SSN is 123-45-6789",
    "redaction_types": ["ssn", "person_name"],
    "output_format": "redacted"
  }'
# Response:
# {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
Microsoft Excel spreadsheets are ubiquitous in business operations—from customer lists and financial reports to HR records and analytics exports. Their tabular structure makes them ideal for organizing large amounts of data, but that same structure creates data protection challenges. A single Excel file might contain thousands of rows with customer names, addresses, account numbers, and financial details spread across multiple columns and sheets.
Beyond the visible data, Excel files contain hidden complexity: formulas referencing sensitive cells, hidden sheets with backup data, cell comments with reviewer notes, named ranges used throughout, and metadata about file authors. Effective Excel redaction must address all these elements while preserving the spreadsheet's functionality and formatting.
Excel files are more complex than simple tables of data:
Multiple Worksheets: Workbooks contain multiple sheets, each potentially with different data and different PII types. A "Master List" sheet might have full customer data while a "Summary" sheet has aggregated figures. Each sheet requires independent analysis.
Formulas and References: Cells containing formulas reference other cells, sheets, or external workbooks. A cell displaying "John Smith" might actually contain =A1 referencing another cell, or =VLOOKUP(...) pulling from a table. Redaction must understand these relationships.
Named Ranges: Excel allows naming cell ranges for use in formulas and navigation. Named ranges like "CustomerNames" or "SSN_List" may contain or reference PII.
Hidden Elements: Rows, columns, and entire sheets can be hidden. Hidden content often contains source data, intermediate calculations, or information not intended for display but still present in the file.
Cell Comments/Notes: Users add comments to cells for collaboration. Comments may contain PII in reviewer notes, questions, or explanations not visible in the main grid.
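To make the named-range point concrete, here is a minimal sketch of how defined names could be listed and flagged in an XLSX workbook part. It is an illustration only, not RedactionAPI's implementation: the keyword list and the function name `suspicious_defined_names` are assumptions, and the sample XML stands in for the `xl/workbook.xml` part of a real file.

```python
import re
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by xl/workbook.xml in XLSX packages.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical keyword list; tune it to your own naming conventions.
SUSPECT = re.compile(r"ssn|name|email|phone|dob|account", re.IGNORECASE)


def suspicious_defined_names(workbook_xml: str) -> dict[str, str]:
    """Map each suspicious defined name to the range or formula it refers to."""
    root = ET.fromstring(workbook_xml)
    found = {}
    for node in root.findall("m:definedNames/m:definedName", NS):
        name = node.get("name", "")
        if SUSPECT.search(name):
            found[name] = (node.text or "").strip()
    return found


# Stand-in for the workbook part of a real file.
SAMPLE = """<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <definedNames>
    <definedName name="CustomerNames">'Master List'!$B$2:$B$500</definedName>
    <definedName name="SSN_List">'Master List'!$D$2:$D$500</definedName>
    <definedName name="TaxRate">Summary!$B$1</definedName>
  </definedNames>
</workbook>"""

flagged = suspicious_defined_names(SAMPLE)
for name, target in flagged.items():
    print(f"{name} -> {target}")
```

A name-based check like this only flags candidates for review; the cells a defined name points to still need content-level detection.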
Formulas present unique redaction challenges requiring careful handling:
Preserve Formulas, Redact Values: Keep formulas functional while redacting the underlying data they reference. A SUM formula continues working, but individual values are redacted. This maintains spreadsheet functionality for future use.
Convert to Values, Then Redact: Calculate all formulas to static values, then redact those values. Removes formula complexity but loses spreadsheet functionality. Appropriate when sharing static snapshots.
Redact Formula Text: Some formulas contain literal PII in their text (=IF(A1="John Smith"...)). These require redaction of the formula text itself.
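Reference Tracking: When redacting a cell referenced by formulas, consider the impact on dependent cells. Replacing a number with placeholder text can turn a dependent arithmetic formula into a #VALUE! error, or silently change an aggregate such as SUM, which ignores text. Decide whether that breakage is acceptable or whether the dependent formulas should be converted to static values first.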
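The last two approaches can be sketched with plain string handling. The example below is a simplified illustration under stated assumptions, not the product's formula engine: it handles only A1-style references on a single sheet, does not expand ranges such as A1:A10, and the helper names `redact_formula_literals` and `dependents` are hypothetical.

```python
import re

# Excel string literals are double-quoted; a doubled quote ("") escapes a quote.
STRING_LITERAL = re.compile(r'"((?:[^"]|"")*)"')
# Plain A1-style references such as B2 or $D$14. Ranges are not expanded, and
# sheet-qualified or R1C1 references are out of scope for this sketch.
CELL_REF = re.compile(r"(?<![A-Za-z0-9_])\$?([A-Z]{1,3})\$?(\d+)(?![A-Za-z0-9_(])")


def redact_formula_literals(formula: str, pii_terms: set[str],
                            marker: str = "[REDACTED]") -> str:
    """Replace quoted literals that exactly match a known PII term."""
    def swap(match: re.Match) -> str:
        return f'"{marker}"' if match.group(1) in pii_terms else match.group(0)
    return STRING_LITERAL.sub(swap, formula)


def dependents(formulas: dict[str, str], target: str) -> list[str]:
    """Return cells whose formula refers directly to `target` (for example "A1")."""
    hits = []
    for cell, formula in formulas.items():
        # Blank out string literals so text like "A1" is not read as a reference.
        bare = STRING_LITERAL.sub('""', formula)
        refs = {col + row for col, row in CELL_REF.findall(bare)}
        if target in refs:
            hits.append(cell)
    return sorted(hits)


formulas = {
    "B1": "=A1",
    "B2": '=IF(A1="John Smith","VIP","")',
    "B3": "=SUM(C1:C3)",
    "B4": '=LEN("A1")',
}

print(dependents(formulas, "A1"))
print(redact_formula_literals(formulas["B2"], {"John Smith"}))
```

Listing dependents before redacting a cell shows which formulas would be affected, which helps when choosing between preserving formulas and converting them to values.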
Hidden Excel content creates significant data protection considerations:
Hidden Rows/Columns: Users hide rows and columns to simplify views, but the data remains in the file. A customer list might hide columns of SSNs or phone numbers that are still included whenever the file is shared or exported.
Hidden Sheets: Entire worksheets can be hidden or "very hidden" (requiring VBA to unhide). These often contain source data, lookup tables, or sensitive information not intended for general view.
Processing Options: You can redact hidden content in place (stays hidden but protected), unhide and redact (makes visible with protection), or remove entirely (eliminates hidden data).
Very Hidden Sheets: Excel's "xlSheetVeryHidden" state requires special handling—sheets set this way don't appear in the unhide dialog. Our processing detects and handles these sheets.
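In the XLSX format, sheet visibility is recorded as a `state` attribute on each sheet entry in the workbook part, with the values `visible` (the default), `hidden`, and `veryHidden`. The sketch below shows one way to read it using only the Python standard library. It is an illustration rather than a description of our internal processing, and `sheet_states` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by xl/workbook.xml.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_states(workbook_xml: str) -> dict[str, str]:
    """Map sheet name to its visibility state; a missing attribute means visible."""
    root = ET.fromstring(workbook_xml)
    return {
        sheet.get("name"): sheet.get("state", "visible")
        for sheet in root.findall("m:sheets/m:sheet", NS)
    }


# Stand-in for the workbook part of a real file.
SAMPLE = """<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="Summary" sheetId="1" r:id="rId1"/>
    <sheet name="Master List" sheetId="2" state="hidden" r:id="rId2"/>
    <sheet name="Lookup" sheetId="3" state="veryHidden" r:id="rId3"/>
  </sheets>
</workbook>"""

states = sheet_states(SAMPLE)
needs_review = [name for name, state in states.items() if state != "visible"]
print(needs_review)
```

For a real file, the same XML can be read with `zipfile.ZipFile(path).read("xl/workbook.xml")`, since an XLSX file is a ZIP package of XML parts.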
Excel data often follows predictable structures enabling targeted processing:
Column-Based Rules: When the spreadsheet structure is known, configure processing by column: "Column B contains names," "Column D contains SSNs." This improves accuracy and performance for structured data.
Table Detection: Excel Tables (ListObjects) have defined structures with headers. We detect tables and can apply column-specific rules based on header text.
Header Row Recognition: Identify header rows to understand column content. Headers like "Social Security Number," "Phone," or "Email" indicate column contents.
Mixed Content Handling: Some spreadsheets mix structured tables with free-form content. Processing adapts to different regions within the same sheet.
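As a rough sketch of header-driven column rules, the example below maps header text to a PII type and redacts every non-empty value in the matching columns. The rule table, the marker format, and the function names are assumptions made for illustration; they are not the API's configuration schema.

```python
import re

# Hypothetical header-to-type rules, checked in order; the first match wins.
HEADER_RULES = [
    (re.compile(r"social security|ssn", re.IGNORECASE), "SSN"),
    (re.compile(r"phone|mobile", re.IGNORECASE), "PHONE"),
    (re.compile(r"e-?mail", re.IGNORECASE), "EMAIL"),
    (re.compile(r"name", re.IGNORECASE), "PERSON_NAME"),
]


def classify_headers(header_row) -> dict[int, str]:
    """Return {column_index: pii_type} for headers that match a rule."""
    plan = {}
    for idx, header in enumerate(header_row):
        for pattern, pii_type in HEADER_RULES:
            if pattern.search(str(header)):
                plan[idx] = pii_type
                break
    return plan


def redact_rows(rows):
    """Treat the first row as headers and redact non-empty cells in flagged columns."""
    header, *body = rows
    plan = classify_headers(header)
    redacted = [list(header)]
    for row in body:
        redacted.append([
            f"[{plan[i]}_REDACTED]" if i in plan and value not in (None, "") else value
            for i, value in enumerate(row)
        ])
    return redacted


rows = [
    ["Name", "Social Security Number", "Region", "Total"],
    ["John Smith", "123-45-6789", "West", 1200],
    ["Ana Lee", "", "East", 900],
]

result = redact_rows(rows)
for row in result:
    print(row)
```

Column rules of this kind skip per-cell detection entirely, which is why they suit spreadsheets with a known schema; free-form regions still need content-based scanning.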
Excel charts and graphics may display PII:
Chart Titles and Labels: Charts with titles like "Sales by John Smith" or data labels showing customer names require text redaction.
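Linked Data: Charts linked to cell ranges display that data visually. Redacting the source cells changes what the chart shows once it refreshes; because XLSX chart parts also store cached copies of series values and labels, those caches need to be refreshed or cleared as well.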
Embedded Text: Text boxes, shapes, and SmartArt graphics may contain typed PII requiring detection and redaction.
Embedded Images: Images inserted in spreadsheets (screenshots, scanned documents) can be processed with OCR for text detection and visual redaction.
Excel files contain extensive metadata:
Document Properties: Author, last modified by, company, title, and custom properties. These reveal file origin and handling history.
Cell-Level Metadata: Each cell may have author information from changes, comment authors, and edit timestamps.
External Links: References to external files may reveal file paths and server names.
Scenario Manager: Excel scenarios store alternate data sets that may contain PII.
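In XLSX files, the author and last-modified-by fields live in the `docProps/core.xml` part of the package. The following sketch blanks those two fields using the Python standard library. It is a simplified illustration, not our processing pipeline; `scrub_core_properties` is a hypothetical helper, and the sample XML stands in for a real core properties part.

```python
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# Keep the familiar prefixes when the XML is written back out.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

# Fields treated as identifying in this sketch. Company is stored separately,
# in docProps/app.xml, and is not handled here.
IDENTIFYING = ("dc:creator", "cp:lastModifiedBy")


def scrub_core_properties(core_xml: str) -> tuple[dict, str]:
    """Blank identifying fields; return the removed values and the cleaned XML."""
    root = ET.fromstring(core_xml)
    removed = {}
    for path in IDENTIFYING:
        node = root.find(path, NS)
        if node is not None and node.text:
            removed[path] = node.text
            node.text = ""
    return removed, ET.tostring(root, encoding="unicode")


SAMPLE = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>Jane Doe</dc:creator>
  <cp:lastModifiedBy>Bob Ray</cp:lastModifiedBy>
  <dc:title>Q3 Customer Export</dc:title>
</cp:coreProperties>"""

removed, cleaned = scrub_core_properties(SAMPLE)
print(removed)
```

Returning the removed values alongside the cleaned XML makes it straightforward to record what was stripped in an audit trail.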
We process all Excel formats:
XLSX (Excel 2007+): Modern Office Open XML format. XML-based structure enables precise manipulation of content and formatting.
XLS (Excel 97-2003): Legacy binary format still common in many organizations. Full support with option to output as XLSX.
XLSM (Macro-Enabled): XLSX with macros. Macros can be preserved, removed, or flagged based on security policy.
XLSB (Binary): Binary format for large files. Supported for input with option for format conversion on output.
Excel files can grow very large with extensive data:
Streaming Processing: For files with millions of cells, streaming techniques process row-by-row without loading the entire file into memory.
Async Processing: Very large files can be processed asynchronously with webhook notification on completion.
Progress Tracking: For long-running jobs, progress updates indicate processing status.
Chunked Output: Large result files can be delivered in chunks or compressed for efficient transfer.
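The idea behind streaming with progress updates can be shown with a Python generator that handles one row at a time and never holds the whole sheet in memory. This is a minimal sketch under stated assumptions: the row source is any iterable of tuples, only a simple SSN pattern is detected, and `redact_stream` and its parameters are hypothetical names rather than part of our API.

```python
import re
from typing import Callable, Iterable, Iterator

# Simple US SSN pattern used only for this illustration.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_stream(rows: Iterable[tuple],
                  progress_every: int = 10000,
                  on_progress: Callable[[int], None] = print) -> Iterator[tuple]:
    """Yield redacted rows one at a time, reporting progress periodically."""
    for count, row in enumerate(rows, start=1):
        if count % progress_every == 0:
            on_progress(count)
        yield tuple(
            SSN.sub("[SSN_REDACTED]", value) if isinstance(value, str) else value
            for value in row
        )


def fake_rows(n: int) -> Iterator[tuple]:
    """Stand-in for a lazily read worksheet."""
    for i in range(n):
        yield (i, f"SSN {i:03d}-45-6789")


seen = []
out = list(redact_stream(fake_rows(5), progress_every=2, on_progress=seen.append))
print(out[0], seen)
```

With a library such as openpyxl, the rows could come from a worksheet opened in read-only mode (`load_workbook(path, read_only=True)` followed by `iter_rows(values_only=True)`), which reads rows lazily in a similar way.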
Excel redaction serves diverse industry needs:
Financial Services: Customer account lists, transaction reports, portfolio spreadsheets containing financial PII requiring protection before sharing or archival.
Healthcare: Patient lists, appointment schedules, billing spreadsheets with PHI requiring HIPAA-compliant protection.
Human Resources: Employee rosters, compensation data, performance reviews in spreadsheet format containing employment PII.
Marketing: Customer lists, campaign data, lead databases often maintained in Excel requiring protection before analytics or sharing.
Research: Survey data, research datasets, participant information in spreadsheets requiring de-identification for publication or sharing.
Multiple options for how redacted content appears:
Placeholder Text: Replace with markers like [REDACTED] or type-specific markers like [SSN_REDACTED].
Blank Cells: Clear the cell contents entirely, preserving the cell and its formatting but removing the data.
Masked Values: Partial masking showing some characters (***-**-1234 for SSN).
Tokenized Values: Consistent replacement tokens enabling data linking without real values.
Row/Column Removal: For highly sensitive data, remove entire rows or columns rather than individual values.
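The difference between placeholder, masked, and tokenized output is easiest to see side by side. The sketch below is illustrative only: the function names are hypothetical, and keyed HMAC-SHA256 is just one reasonable way to produce consistent tokens, not necessarily the scheme the service uses.

```python
import hashlib
import hmac


def placeholder(pii_type: str = "SSN") -> str:
    """Type-specific marker such as [SSN_REDACTED]."""
    return f"[{pii_type}_REDACTED]"


def mask_ssn(value: str) -> str:
    """Keep only the last four digits; fall back to a marker if the input is not 9 digits."""
    digits = [ch for ch in value if ch.isdigit()]
    if len(digits) != 9:
        return placeholder("SSN")
    return "***-**-" + "".join(digits[-4:])


def tokenize(value: str, secret: bytes, pii_type: str = "TOKEN") -> str:
    """Derive a stable token: the same value and secret always give the same token."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{pii_type}_{digest[:12]}"


secret = b"replace-with-a-real-secret"  # assumption: managed outside the code
print(placeholder("SSN"))
print(mask_ssn("123-45-6789"))
print(tokenize("John Smith", secret, "PERSON") == tokenize("John Smith", secret, "PERSON"))
```

Because a token depends only on the value and the secret, the same customer receives the same token on every sheet, so rows can still be joined without exposing the real value. Keeping the secret private matters, since anyone who holds it can test guesses against the tokens.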
RedactionAPI has transformed our document processing workflow. We've reduced manual redaction time by 95% while achieving better accuracy than our previous manual process.
The API integration was seamless. Within a week, we had automated redaction running across all our customer support channels, ensuring GDPR compliance effortlessly.
We process over 50,000 legal documents monthly. RedactionAPI handles it all with incredible accuracy and speed. It's become an essential part of our legal tech stack.
The multi-language support is outstanding. We operate in 30 countries and RedactionAPI handles all our documents regardless of language with consistent accuracy.
Trusted by 500+ enterprises worldwide

You have options: preserve formulas while redacting referenced values (formula remains functional with redacted inputs), convert formulas to values then redact (static result), or redact formula text if it contains literal PII. We maintain formula integrity where possible.
Hidden content often contains sensitive data. We scan all hidden sheets, rows, and columns by default. You can choose to redact hidden content in place, unhide and redact, or remove hidden content entirely.
Yes, cell comments and notes are scanned for PII and redacted. Reviewer names in comment metadata can also be removed. This handles the common case of sensitive information in review notes.
We process spreadsheets with millions of cells efficiently using streaming techniques. Memory usage is optimized for large files. Very large workbooks may use async processing with results delivered via webhook.
Yes, you can configure column-based processing—specify which columns contain PII types, or which columns to skip. This enables efficient processing of structured spreadsheets with known schemas.
We support XLSX files from all sources including Excel for Mac, Excel Online, and files exported from Google Sheets. The processing is format-based, not application-specific.