Protect sensitive information in application logs, server logs, and audit trails. Maintain debugging capability while ensuring privacy compliance across all your logging infrastructure.
Comprehensive log protection
Process JSON logs, syslog, Apache/Nginx, application logs, and custom formats.
Handle high-volume log streams in real-time with minimal latency.
Maintain log structure and correlation IDs while redacting PII.
Integrate with ELK, Splunk, Datadog, and other log management systems.
Redact historical log archives for compliance remediation.
Understand log context to avoid false positives in technical fields.
Simple integration, powerful results
Send your documents, text, or files through our secure API endpoint or web interface.
Our AI analyzes content to identify all sensitive information types with 99.7% accuracy.
Sensitive data is automatically redacted based on your configured compliance rules.
Receive your redacted content with full audit trail and compliance documentation.
Get started with just a few lines of code
import requests
api_key = "your_api_key"
url = "https://api.redactionapi.net/v1/redact"
data = {
    "text": "John Smith's SSN is 123-45-6789",
    "redaction_types": ["ssn", "person_name"],
    "output_format": "redacted"
}
response = requests.post(
    url,
    headers={"Authorization": f"Bearer {api_key}"},
    json=data
)
print(response.json())
# Output: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
const axios = require('axios');
const apiKey = 'your_api_key';
const url = 'https://api.redactionapi.net/v1/redact';
const data = {
  text: "John Smith's SSN is 123-45-6789",
  redaction_types: ["ssn", "person_name"],
  output_format: "redacted"
};
axios.post(url, data, {
  headers: { 'Authorization': `Bearer ${apiKey}` }
})
  .then(response => {
    console.log(response.data);
    // Output: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
  });
curl -X POST https://api.redactionapi.net/v1/redact \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "John Smith'\''s SSN is 123-45-6789",
    "redaction_types": ["ssn", "person_name"],
    "output_format": "redacted"
  }'
# Response:
# {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
Application logs are essential for debugging, monitoring, and security analysis—but they inevitably capture personal information. User emails appear in authentication logs, customer names in transaction records, IP addresses in access logs, and sensitive data in error messages that expose request payloads. This creates a tension between operational needs and privacy requirements: developers need detailed logs to troubleshoot issues, while regulations like GDPR require minimization and protection of personal data.
Log redaction resolves this tension by automatically detecting and protecting PII while preserving the technical information needed for operations. Whether processing real-time log streams or remediating historical archives, intelligent redaction ensures your logs remain useful for their intended purpose without exposing sensitive data to unauthorized access or creating compliance violations.
Common sources of personal information in logs:
Authentication Logs:
// Login events capture user identifiers
2024-01-15 10:30:00 INFO [auth] Login attempt: john.smith@example.com ip=192.168.1.100
2024-01-15 10:30:01 ERROR [auth] Failed login for john.smith@example.com: invalid password
2024-01-15 10:30:02 INFO [auth] Password reset requested for jane.doe@example.com
// PII present: email addresses, IP addresses
Application Logs:
// Request/response logging
2024-01-15 10:31:00 DEBUG [api] POST /users {"name":"John Smith","email":"john.smith@example.com","ssn":"123-45-6789"}
2024-01-15 10:31:01 INFO [order] Order created for customer_id=12345 (Jane Doe, jane.doe@example.com)
2024-01-15 10:31:02 ERROR [payment] Payment failed for card ending 4242, customer: Bob Wilson
// PII present: names, emails, SSN, partial card numbers
Error Logs:
// Exceptions often expose sensitive data
2024-01-15 10:32:00 ERROR [db] Query failed: SELECT * FROM users WHERE email='john.smith@example.com'
2024-01-15 10:32:01 FATAL [app] Unhandled exception processing user John Smith (555-123-4567)
Stack trace: ...
Request body: {"credit_card":"4111111111111111","cvv":"123"}
// PII in error context: queries, stack traces, request payloads
Access Logs:
// Web server access logs
192.168.1.100 - john.smith@example.com [15/Jan/2024:10:33:00 +0000] "GET /account/profile HTTP/1.1" 200 1234
192.168.1.101 - - [15/Jan/2024:10:33:01 +0000] "POST /api/users?email=jane.doe@example.com HTTP/1.1" 201 89
// PII present: IP addresses, usernames, emails in URLs
Process log streams as they're generated:
Streaming Architecture:
// Log source → Redaction → Log destination
Application → Fluentd/Logstash → RedactionAPI → (redacted stream) → Elasticsearch
// Processing flow
1. Log event generated by application
2. Log shipper collects event
3. Event sent to redaction service
4. PII detected and redacted
5. Clean event forwarded to storage/analysis
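For quick experimentation before wiring up a shipper, this flow can be exercised directly from a script. The sketch below is a minimal Python illustration: it assumes the /v1/redact/stream endpoint (used in the Logstash example that follows) accepts a single text field per event and returns a redacted_text field; the exact request and response shape may differ from your account's configuration.
import os
import sys
import requests

API_KEY = os.environ["REDACTION_API_KEY"]
STREAM_URL = "https://api.redactionapi.net/v1/redact/stream"

def redact_line(line):
    """Send one log event to the redaction service and return the clean event."""
    resp = requests.post(
        STREAM_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"text": line, "format": "log"},  # request shape is an assumption
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json().get("redacted_text", line)

# Read raw events from the shipper on stdin, emit redacted events on stdout
for raw in sys.stdin:
    sys.stdout.write(redact_line(raw.rstrip("\n")) + "\n")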
Logstash Filter Integration:
# logstash.conf
filter {
  http {
    url => "https://api.redactionapi.net/v1/redact/stream"
    verb => "POST"
    headers => {
      "Authorization" => "Bearer ${REDACTION_API_KEY}"
    }
    body => {
      "text" => "%{message}"
      "format" => "log"
    }
    body_format => "json"
    target_body => "redacted"
  }
  mutate {
    replace => { "message" => "%{[redacted][text]}" }
  }
}
Fluent Bit Configuration:
[FILTER]
    Name    lua
    Match   *
    script  redact.lua
    call    redact_pii

[OUTPUT]
    Name    http
    Match   *
    Host    api.redactionapi.net
    Port    443
    URI     /v1/redact/batch
    Format  json
    tls     On
Handle various log formats intelligently:
JSON Logs:
// Input
{"timestamp":"2024-01-15T10:30:00Z","level":"INFO","user":"[email protected]","action":"login","ip":"192.168.1.100"}
// Output (field-aware redaction)
{"timestamp":"2024-01-15T10:30:00Z","level":"INFO","user":"[EMAIL]","action":"login","ip":"[IP_ADDRESS]"}
// Structure preserved, PII redacted by field type
Syslog Format:
// RFC 5424 syslog
<165>1 2024-01-15T10:30:00Z myhost myapp 1234 - - User john.smith@example.com logged in from 192.168.1.100
// Redacted
<165>1 2024-01-15T10:30:00Z myhost myapp 1234 - - User [EMAIL] logged in from [IP_ADDRESS]
// Syslog structure (priority, timestamp, host, app) preserved
Apache/Nginx Access Logs:
// Combined log format
192.168.1.100 - john.smith@example.com [15/Jan/2024:10:30:00 +0000] "GET /user/profile HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0..."
// Redacted
[IP_ADDRESS] - [EMAIL] [15/Jan/2024:10:30:00 +0000] "GET /user/profile HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0..."
// Log analysis tools still parse correctly
Custom Formats:
// Define custom format patterns
{
  "format": "custom",
  "pattern": "{timestamp} [{level}] {message}",
  "fields": {
    "timestamp": {"type": "timestamp", "redact": false},
    "level": {"type": "loglevel", "redact": false},
    "message": {"type": "text", "redact": true}
  }
}
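A definition like this can be supplied alongside the text being redacted. The sketch below shows one way it might be attached to a request in Python; how the API actually accepts custom format definitions is an assumption here, so treat it purely as an illustration.
import requests

payload = {
    "text": "2024-01-15 [INFO] Order created for jane.doe@example.com",
    "format": "custom",
    "pattern": "{timestamp} [{level}] {message}",
    "fields": {
        "timestamp": {"type": "timestamp", "redact": False},
        "level": {"type": "loglevel", "redact": False},
        "message": {"type": "text", "redact": True}
    }
}
resp = requests.post(
    "https://api.redactionapi.net/v1/redact",
    headers={"Authorization": "Bearer your_api_key"},
    json=payload,
)
print(resp.json())  # only the {message} portion should be scanned for PII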
Maintain debugging and analysis capabilities:
Correlation ID Preservation:
// Correlation IDs link related log entries
{"trace_id":"abc123","user":"[email protected]","action":"checkout"}
{"trace_id":"abc123","user":"[email protected]","action":"payment"}
{"trace_id":"abc123","user":"[email protected]","action":"confirmation"}
// Redacted with consistent tokenization
{"trace_id":"abc123","user":"tok_user_001","action":"checkout"}
{"trace_id":"abc123","user":"tok_user_001","action":"payment"}
{"trace_id":"abc123","user":"tok_user_001","action":"confirmation"}
// Same user = same token for journey analysis
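Consistent tokenization can also be reproduced client-side with a keyed hash, so the same value always maps to the same token without storing a lookup table. A minimal Python sketch, assuming you manage the secret key yourself (the token format is illustrative, not the API's):
import hashlib
import hmac

SECRET_KEY = b"rotate-me"  # illustrative; store and rotate this securely

def tokenize(value, prefix="tok_user"):
    """Map a PII value to a stable, non-reversible token."""
    digest = hmac.new(SECRET_KEY, value.lower().encode(), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:8]}"

print(tokenize("jane.doe@example.com"))  # same input always yields the same token
print(tokenize("jane.doe@example.com"))  # identical output, so journey analysis still works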
IP Address Handling Options:
// Full redaction
192.168.1.100 → [IP_ADDRESS]
// Partial masking (preserve network)
192.168.1.100 → 192.168.1.xxx
// Geographic preservation
192.168.1.100 → [IP_ADDRESS:US:CA:San Francisco]
// Hashing (consistent reference)
192.168.1.100 → ip_hash_a1b2c3d4
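These modes can be chosen per log source. A rough local sketch of the full, partial, and hashed behaviours (the option names the API uses for these modes are not shown here, so this is purely illustrative):
import hashlib
import ipaddress

def mask_ip(ip, mode="partial"):
    addr = ipaddress.ip_address(ip)
    if mode == "partial" and addr.version == 4:
        octets = ip.split(".")
        return ".".join(octets[:3] + ["xxx"])  # keep the /24 network, drop the host
    if mode == "hash":
        return "ip_hash_" + hashlib.sha256(ip.encode()).hexdigest()[:8]
    return "[IP_ADDRESS]"  # full redaction as the default

print(mask_ip("192.168.1.100", "partial"))  # 192.168.1.xxx
print(mask_ip("192.168.1.100", "hash"))     # ip_hash_ followed by 8 hex characters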
Timestamp Precision:
// Preserve exact timestamps for debugging
// Optional: Generalize for additional privacy
"2024-01-15T10:30:45.123Z" → "2024-01-15T10:30:45.123Z" // Keep
"2024-01-15T10:30:45.123Z" → "2024-01-15T10:30:00Z" // Minute precision
"2024-01-15T10:30:45.123Z" → "2024-01-15" // Date only
Avoid false positives in technical contexts:
// Technical data that looks like PII but isn't
// UUIDs (not personal identifiers)
user_id: 550e8400-e29b-41d4-a716-446655440000 // Don't redact
// Numeric codes in technical contexts
error_code: 123-45-6789 // Not an SSN in this context
// Hostnames with email-like patterns
smtp.example.com // Not an email address
// Version numbers
version: 1.234.5 // Not a phone number
// Context indicators help avoid false positives:
- Field names (error_code vs ssn)
- Surrounding text ("version" vs "phone")
- Format validation (UUID regex vs SSN regex)
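The same signals can be approximated in a pre-filter of your own. A simplified sketch of field-name and format checks (the API's context model is richer; this only illustrates the idea):
import re

SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
UUID_RE = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I)

# Field names that indicate a value is technical rather than personal
TECHNICAL_FIELDS = {"error_code", "trace_id", "request_id", "user_id", "version"}

def should_redact_as_ssn(field_name, value):
    if field_name.lower() in TECHNICAL_FIELDS:
        return False                     # e.g. error_code: 123-45-6789 is not an SSN
    if UUID_RE.match(value):
        return False                     # UUIDs are not personal identifiers
    return bool(SSN_RE.match(value))     # the value must actually look like an SSN

print(should_redact_as_ssn("ssn", "123-45-6789"))         # True
print(should_redact_as_ssn("error_code", "123-45-6789"))  # False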
Elasticsearch/ELK Stack:
// Logstash pipeline with redaction
input {
  beats { port => 5044 }
}
filter {
  # Parse log format
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Redact PII fields
  http {
    url => "https://api.redactionapi.net/v1/redact"
    headers => {
      "Authorization" => "Bearer ${REDACTION_API_KEY}"
    }
    body => {
      "fields" => {
        "clientip" => "%{clientip}"
        "auth" => "%{auth}"
        "request" => "%{request}"
      }
    }
    body_format => "json"
  }
}
output {
  elasticsearch { hosts => ["elasticsearch:9200"] }
}
Splunk Integration:
# Splunk HEC with preprocessing
# Use Splunk's modular inputs or HEC event modification
# props.conf - transform before indexing
[source::myapp_logs]
TRANSFORMS-redact = redact_pii
# transforms.conf
[redact_pii]
REGEX = (?m)^(.*)email=\S+(.*)$
FORMAT = $1email=[EMAIL]$2
DEST_KEY = _raw
Datadog Integration:
// Datadog Log Pipeline with redaction
// Use Datadog's Sensitive Data Scanner or preprocess
// Agent configuration
logs:
  - type: file
    path: /var/log/myapp/*.log
    log_processing_rules:
      - type: mask_sequences
        name: redact_emails
        pattern: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
        replace_placeholder: "[EMAIL]"
Remediate existing log archives:
// Batch process historical logs
const archiveJob = await redactionClient.createBatchJob({
  source: {
    type: 's3',
    bucket: 'company-log-archives',
    prefix: 'logs/2023/',
    pattern: '*.log.gz'
  },
  processing: {
    format: 'auto_detect',
    compression: 'gzip',
    parallelism: 10
  },
  output: {
    bucket: 'company-log-archives-redacted',
    prefix: 'logs/2023/',
    preserveStructure: true
  }
});

// Monitor progress
let status = await redactionClient.getJobStatus(archiveJob.id);
while (status.status !== 'completed') {
  console.log(`Processed: ${status.filesProcessed}/${status.totalFiles}`);
  await sleep(10000);
  status = await redactionClient.getJobStatus(archiveJob.id);
}
GDPR Log Requirements:
Security Logging Requirements:
// Stream processing endpoint
POST /v1/redact/logs
Content-Type: application/x-ndjson
{"timestamp":"2024-01-15T10:30:00Z","message":"User [email protected] logged in"}
{"timestamp":"2024-01-15T10:30:01Z","message":"Order placed by Jane Doe (555-1234)"}
Response (streaming):
{"timestamp":"2024-01-15T10:30:00Z","message":"User [EMAIL] logged in"}
{"timestamp":"2024-01-15T10:30:01Z","message":"Order placed by [NAME] ([PHONE])"}
// Batch processing for files
POST /v1/redact/file
{
  "file_url": "s3://logs/application.log.gz",
  "format": "auto",
  "compression": "gzip"
}
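A minimal client for the streaming endpoint might look like the sketch below. It assumes /v1/redact/logs accepts newline-delimited JSON as shown above and streams redacted events back one JSON object per line; the exact response framing is an assumption.
import requests

events = [
    '{"timestamp":"2024-01-15T10:30:00Z","message":"User user@example.com logged in"}',
    '{"timestamp":"2024-01-15T10:30:01Z","message":"Order placed by Jane Doe (555-1234)"}',
]

resp = requests.post(
    "https://api.redactionapi.net/v1/redact/logs",
    headers={
        "Authorization": "Bearer your_api_key",
        "Content-Type": "application/x-ndjson",
    },
    data="\n".join(events),
    stream=True,
)
for line in resp.iter_lines():
    if line:
        print(line.decode())  # redacted event, one JSON object per line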
RedactionAPI has transformed our document processing workflow. We've reduced manual redaction time by 95% while achieving better accuracy than our previous manual process.
The API integration was seamless. Within a week, we had automated redaction running across all our customer support channels, ensuring GDPR compliance effortlessly.
We process over 50,000 legal documents monthly. RedactionAPI handles it all with incredible accuracy and speed. It's become an essential part of our legal tech stack.
The multi-language support is outstanding. We operate in 30 countries and RedactionAPI handles all our documents regardless of language with consistent accuracy.
Trusted by 500+ enterprises worldwide
Logs often capture user activity including emails in login attempts, IP addresses, request parameters with user data, error messages with customer details, and audit trails. This data is useful for debugging but creates compliance and privacy risks.
We offer streaming API endpoints optimized for log processing. Logs can be processed in batches or as continuous streams, with sub-millisecond per-event latency for real-time pipelines. Horizontal scaling handles any volume.
We support JSON logs, syslog (RFC 3164/5424), Apache/Nginx access and error logs, application logs (Log4j, Winston, etc.), CSV/TSV log exports, and custom formats. Format detection is automatic or can be specified.
We preserve log structure, timestamps, log levels, and correlation IDs. Tokenization options maintain referential integrity for user journey analysis. IP addresses can be masked to preserve geographic data while removing identification.
Yes, we integrate with major platforms: ELK Stack (via Logstash filter), Splunk (via HEC), Datadog, Sumo Logic, and cloud logging services (CloudWatch, Stackdriver). Custom integrations are available via the API.
Retroactive processing handles historical logs for compliance remediation. Process archived files in S3, Azure Blob, GCS, or on-premises storage. Batch processing optimized for large archive volumes.