Encryption and tokenization both protect sensitive data on IBM Z mainframes, but they serve different purposes. Encryption transforms data using a cryptographic key and is best for bulk data at rest, backups, and data in transit. Tokenization replaces sensitive values with non-sensitive surrogates and is best for field-level protection of PANs, SINs, and PHI — especially when the goal is PCI or HIPAA audit scope reduction. Most mature mainframe environments use both: encryption at the data-set level, tokenization at the field level. This guide compares both techniques across 19 dimensions specific to z/OS workloads.
Encryption is a reversible algorithmic transformation that uses a cryptographic key — typically AES-256 in XTS mode — to convert plaintext into ciphertext. On IBM Z mainframes, encryption is implemented through z/OS Data Set Encryption and Pervasive Encryption, with cryptographic operations accelerated by the on-chip CPACF and offloaded to Crypto Express HSM cards. ICSF (Integrated Cryptographic Service Facility) manages keys through the CKDS and PKDS keystores, with key ceremony, rotation, and access controlled by RACF policies. Unlike software-based encryption in cloud or SaaS platforms, where CPU cost is abstracted away, mainframe encryption has a direct MIPS/MSU cost impact — hardware acceleration is what makes it economically viable at scale.
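To make the mechanics concrete, here is a minimal software sketch of AES-256 authenticated encryption using Python's cryptography package (AES-GCM here for brevity; the XTS mode used for data sets is a disk-oriented construction). It is purely illustrative: on IBM Z these operations run in hardware under ICSF-managed keys. Note that the ciphertext is binary and longer than the plaintext, the property that makes column-level encryption awkward for fixed-length fields:

```python
# Minimal software illustration of AES-256 authenticated encryption using
# the 'cryptography' package. On IBM Z the same operation runs in CPACF /
# Crypto Express hardware under ICSF-managed keys; this is concept only.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # 256-bit key
nonce = os.urandom(12)                      # unique per encryption

pan = b"4111111111111111"                   # 16 bytes of plaintext
ciphertext = AESGCM(key).encrypt(nonce, pan, None)

print(len(pan), len(ciphertext))            # 16 vs 32: GCM appends a 16-byte tag
assert AESGCM(key).decrypt(nonce, ciphertext, None) == pan  # reversible with the key
```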
Tokenization replaces sensitive data elements — PANs, SINs, Social Security numbers, PHI — with non-sensitive surrogates called tokens that have no mathematical relationship to the original values. On mainframes, tokenization deploys via an API gateway or transparent proxy positioned between the application tier and data layer, intercepting sensitive values before they reach COBOL batch programs or DB2 tables. Vaulted tokenization stores original-to-token mappings in a secure vault; vaultless tokenization uses format-preserving algorithms (FF1/FF3-1) to generate tokens without vault dependency — DataStealth's agentless approach eliminates both vault infrastructure and code changes on the mainframe. Tokens maintain the same length and data type as the original value, which is critical for COBOL copybooks that enforce fixed-length PIC clauses and DB2 columns with strict type constraints.
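As a rough illustration of the vaultless approach, the sketch below runs a toy keyed Feistel network over a digit string. It is a simplified stand-in for NIST FF1/FF3-1, not a secure or standards-compliant FPE implementation, and the demo key, round count, and even-length assumption are all inventions for the example:

```python
# Toy format-preserving tokenizer: an HMAC-based Feistel network over digit
# strings. Illustrative stand-in for NIST FF1/FF3-1 -- not production crypto.
# Assumes an even number of digits (e.g., a 16-digit PAN).
import hashlib
import hmac

KEY = b"demo-key-do-not-use"   # hypothetical demo key

def _round_value(half_block: str, round_no: int, width: int) -> int:
    """Keyed pseudorandom function for one Feistel round."""
    digest = hmac.new(KEY, f"{round_no}:{half_block}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % (10 ** width)

def tokenize_digits(digits: str, rounds: int = 8) -> str:
    """Map a digit string to a same-length, all-digit token."""
    half = len(digits) // 2
    left, right = digits[:half], digits[half:]
    for r in range(rounds):
        mixed = (int(left) + _round_value(right, r, half)) % (10 ** half)
        left, right = right, f"{mixed:0{half}d}"
    return left + right

pan = "4111111111111111"
token = tokenize_digits(pan)
assert len(token) == len(pan) and token.isdigit()   # fits a PIC 9(16) field
print(pan, "->", token)
```

Because the mapping is keyed and deterministic, the same input always yields the same token, which preserves joins and referential integrity; the real FF1/FF3-1 algorithms add a tweak input and carry formal security analyses.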
Use encryption when you need to protect bulk data at rest, in transit, or in backups — anywhere referential integrity and search don't matter and you control the keys end-to-end. Use tokenization when sensitive fields (PAN, SIN, PHI) move through COBOL apps, analytics, or third parties and you want to shrink PCI/HIPAA audit scope. Most mature mainframe shops use both — encryption for the data set, tokenization for the field.
A bank running DB2 on z15 stores PANs across multiple COBOL batch applications and needs to achieve PCI DSS 4.0 compliance with minimal disruption. Encrypting PANs at the column level would force COBOL applications to handle variable-length ciphertext, triggering copybook changes across hundreds of programs.
Format-preserving tokenization replaces each PAN with a token that maintains the 16-digit structure. COBOL batch jobs process tokens as if they were real PANs — no code changes required. The cardholder data environment shrinks from dozens of systems to the tokenization proxy and vault, reducing audit scope and audit cost.
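A minimal sketch of the vaulted side of that architecture, with invented names: each PAN maps to a random surrogate of the same length and type, and the mapping lives only in the vault. A production vault would be encrypted, access-controlled, and audited:

```python
# Minimal vaulted tokenization sketch: random same-format surrogates stored
# against the original PAN. The in-memory dicts stand in for a secure vault.
import secrets

class TokenVault:
    def __init__(self) -> None:
        self._token_by_pan: dict[str, str] = {}
        self._pan_by_token: dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        if pan in self._token_by_pan:            # stable token per PAN
            return self._token_by_pan[pan]
        while True:                              # draw until unique
            token = "".join(secrets.choice("0123456789") for _ in pan)
            if token not in self._pan_by_token:
                break
        self._token_by_pan[pan] = token
        self._pan_by_token[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        return self._pan_by_token[token]

vault = TokenVault()
token = vault.tokenize("4111111111111111")
assert len(token) == 16 and token.isdigit()      # COBOL PIC 9(16) still fits
assert vault.detokenize(token) == "4111111111111111"
```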
For a bank with PAN data flowing through multiple COBOL applications, tokenization delivers PCI scope reduction that encryption alone cannot achieve.
An insurance carrier backs up large VSAM data sets nightly to physical tape for disaster recovery and regulatory retention. Backup media is a historically common breach vector — lost or mishandled tapes expose policyholder data across years of records.
z/OS Pervasive Encryption with Crypto Express HSM encrypts data sets transparently. CPACF coprocessors offload cryptographic operations from general-purpose processors — minimal MIPS impact, estimable via IBM's zBNA planning tool. Backup data is encrypted at rest without touching any COBOL program or VSAM access method.
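To see what "transparently" means here, the sketch below approximates the behaviour in ordinary software: the application reads and writes plaintext, while the bytes on storage, and therefore on any tape copy, are ciphertext. The file name, record contents, and in-process key are inventions; the real facility does this inside the DFSMS access methods with ICSF-protected keys and no application code:

```python
# Conceptual model of transparent at-rest encryption: plaintext in the
# application, ciphertext on storage. Uses AES-GCM from the 'cryptography'
# package; z/OS data set encryption does this in the access method instead.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)        # stand-in for an ICSF key

def write_encrypted(path: str, plaintext: bytes) -> None:
    nonce = os.urandom(12)
    with open(path, "wb") as f:
        f.write(nonce + AESGCM(key).encrypt(nonce, plaintext, None))

def read_decrypted(path: str) -> bytes:
    with open(path, "rb") as f:
        raw = f.read()
    return AESGCM(key).decrypt(raw[:12], raw[12:], None)

write_encrypted("policy.backup", b"policyholder records ...")
assert read_decrypted("policy.backup") == b"policyholder records ..."
# A tape image of policy.backup contains only the nonce and ciphertext.
```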
When the primary risk is data-at-rest exposure on backup media with no downstream application dependencies, encryption with hardware acceleration is the most cost-effective control.
A healthcare payer stores PHI in IMS and DB2 on the mainframe and feeds data to a cloud-based analytics platform for population health analysis. PHI moving off the mainframe extends HIPAA scope to the cloud environment and creates breach notification risk.
Tokenize PHI fields before data exits the mainframe perimeter. Format-preserving tokens maintain field lengths for ETL pipeline compatibility, and the analytics platform processes tokens, not PHI. When the tokenized output meets the §164.514 de-identification standard, HIPAA safe harbour applies — a breach of tokens alone is not a reportable breach of PHI.
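A hedged sketch of that export step, with invented field names: PHI columns in each record are swapped for same-length surrogates before the record leaves the perimeter, and non-PHI columns pass through untouched. tokenize_field() is a deterministic HMAC-derived placeholder, not real FF1/FF3-1:

```python
# Tokenize PHI fields in a record before ETL export, preserving field
# lengths. Field names and the keying are hypothetical; the HMAC-derived
# digits are a placeholder for a real FPE algorithm such as FF1/FF3-1.
import hashlib
import hmac

KEY = b"demo-phi-key"   # in production this would be HSM-managed

def tokenize_field(value: str) -> str:
    mac = hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()
    digits = "".join(c for c in mac if c.isdigit())
    return (digits * 4)[: len(value)]            # same length as the original

record = {"member_id": "123456789", "dob": "19700101", "plan": "GOLD-2"}
phi_fields = {"member_id", "dob"}                # assumed PHI columns

safe = {k: tokenize_field(v) if k in phi_fields else v for k, v in record.items()}
assert all(len(safe[k]) == len(record[k]) for k in phi_fields)
print(safe)   # tokens where PHI was; 'plan' passes through unchanged
```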
When protected health information moves from a mainframe to a cloud analytics platform, tokenization at the field level provides both HIPAA safe harbour and format compatibility that encryption would break.
Encryption meets PCI DSS 4.0 Req. 3.5, but strong cryptographic key management is still required and the encrypted data remains cardholder data — every system with access to the keys stays in your assessable cardholder data environment. Tokenization, when properly segmented, removes systems from PCI scope entirely; your QSA must verify the architecture and segmentation. In practice, tokenization is the only data protection technique that can take mainframe systems out of PCI DSS scope, while encryption keeps every system with key access in scope.
Both encryption and tokenization are acceptable safeguards under HIPAA's Security Rule, 45 CFR §164.312(a)(2)(iv). Tokenization is preferred when PHI moves to analytics platforms or third parties — HIPAA safe harbour under §164.514 applies to properly de-identified data. Both techniques satisfy HIPAA requirements, but tokenization provides safe harbour protection that encryption does not when PHI is shared externally.
Under GDPR Art. 32, encryption is a recognized security measure, but encrypted data remains personal data — it is reversible with the key. Tokenization qualifies as strong pseudonymization under Art. 4(5) when vaulted tokens have no mathematical relationship to the original value. The distinction affects your breach notification threshold — a breach of tokenized data may be exempt from Art. 34 notification obligations. Under PIPEDA, both are adequate safeguards under Principle 4.7. For Canadian enterprises operating cross-border, tokenization provides the stronger safeguard for international data transfers.
References: NIST SP 800-57 Part 1 Rev. 5: Recommendation for Key Management | PCI SSC Tokenization Product Security Guidelines
Encryption deployment on z/OS starts with ICSF configuration — activating the CKDS (Cryptographic Key Data Set, for symmetric keys), PKDS (Public Key Data Set, for asymmetric keys), and TKDS (Token Data Set, for PKCS#11 objects). Crypto Express HSM cards — generation-specific to your processor (CEX6S on z14, CEX7S on z15, CEX8S on z16/z17) — handle key generation and secure storage in CCA coprocessor mode. Key ceremony follows dual-control, split-knowledge procedures for master key loading via the Trusted Key Entry (TKE) workstation.
RACF policies — specifically CSFSERV and CSFKEYS profiles — control which users and batch jobs can access key material. Pervasive Encryption activates at the data-set level through DFSMS data class attributes — encrypted data sets must be SMS-managed and in extended format. Existing unencrypted data sets cannot be converted in place; data must be migrated to a new encrypted data set. SMF Type 82 records provide the audit trail for all cryptographic operations.
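The key hierarchy behind the CKDS can be sketched with AES key wrap: operational keys are stored only in wrapped form under a master key, which under ICSF never leaves the Crypto Express card. The key label and in-process master key below are illustrative only:

```python
# Sketch of the ICSF/CKDS key hierarchy: data-encrypting keys are stored
# wrapped (encrypted) under a master key. Uses RFC 3394 AES Key Wrap via
# the 'cryptography' package; names and keys are illustrative only.
import os
from cryptography.hazmat.primitives.keywrap import aes_key_wrap, aes_key_unwrap

master_key = os.urandom(32)        # in ICSF, this never leaves the HSM

# Generate an operational key and wrap it for storage in the keystore.
data_key = os.urandom(32)
wrapped = aes_key_wrap(master_key, data_key)

# The CKDS stores only the wrapped form, indexed by key label.
ckds = {"DATASET.PAYROLL.KEY00001": wrapped}     # hypothetical key label

# At use time, the wrapped key is unwrapped inside the HSM boundary.
recovered = aes_key_unwrap(master_key, ckds["DATASET.PAYROLL.KEY00001"])
assert recovered == data_key
```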
Tokenization deploys via a transparent proxy or API gateway that sits between the application tier and the data layer — no COBOL or PL/I code changes. The proxy intercepts data flows and tokenizes or detokenizes in the network path.
COBOL copybook field mapping identifies PIC clauses for PAN, SIN, and PHI fields and configures format-preserving token patterns. RACF service accounts authenticate the proxy to z/OS. If vaulted, the proxy connects to secure token storage (which can reside off-mainframe). If vaultless — the DataStealth model — algorithmic token generation runs without vault dependency, reducing infrastructure complexity and deployment time from months to days.
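A small sketch of the copybook-mapping step, assuming a simplified PIC syntax and hypothetical field names: each clause is parsed into the length and alphabet that a format-preserving token for that field must satisfy:

```python
# Derive format-preserving token constraints from COBOL copybook PIC clauses.
# Handles only the simple PIC 9(n) / PIC X(n) forms; real copybooks also use
# signs, decimals, and repeated symbols (e.g., PIC 999), which are omitted.
import re

PIC_RE = re.compile(r"PIC\s+(?P<type>[9X])\((?P<len>\d+)\)", re.IGNORECASE)

def token_pattern(copybook_line: str) -> dict:
    m = PIC_RE.search(copybook_line)
    if m is None:
        raise ValueError(f"no PIC clause found: {copybook_line!r}")
    return {
        "length": int(m.group("len")),
        "alphabet": "digits" if m.group("type").upper() == "9" else "alphanumeric",
    }

print(token_pattern("05  CARD-PAN    PIC 9(16)."))   # {'length': 16, 'alphabet': 'digits'}
print(token_pattern("05  MEMBER-SIN  PIC X(09)."))   # {'length': 9, 'alphabet': 'alphanumeric'}
```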
Neither is inherently more secure — they protect against different threat vectors. Encryption makes data unreadable without a key, concentrating risk in key management. Tokenization removes the sensitive data itself from protected systems — stolen tokens are worthless without access to the vault or detokenization service, shrinking the breach blast radius to that service alone. On z/OS, the strongest architecture combines both: Pervasive Encryption for data-set-level protection and tokenization for field-level protection of PANs, SINs, and PHI.
Yes. Format-preserving tokenization generates tokens that maintain the same field length and data type as the original value. COBOL copybooks enforce fixed-length PIC clauses — format-preserving tokens satisfy these constraints without modifying any PIC definition. Deployment via a transparent proxy intercepts data flows between the application and database, tokenizing and detokenizing without touching COBOL, PL/I, or Assembler code. DataStealth's agentless model deploys in days, not months.
Yes — this is the recommended architecture for mature mainframe environments. Encrypt at the data-set level using Pervasive Encryption for broad protection of VSAM, DB2, and sequential data sets. Tokenize at the field level for targeted protection of specific sensitive values — PANs, SINs, PHI. Layer encryption in transit (TLS or AT-TLS) for data moving between LPARs or to distributed systems. Each layer addresses a distinct risk vector without redundant complexity.
Encryption with CPACF hardware acceleration has minimal MIPS impact — the on-chip crypto assist executes AES at hardware speed, so little general-purpose capacity is consumed. IBM's zBNA (z Batch Network Analyzer) tool estimates the additional CPU cost of encryption for specific workloads. Without hardware acceleration, AES encryption can consume significant general-purpose processor capacity. Vaultless tokenization has compute cost comparable to format-preserving encryption; vaulted tokenization adds I/O latency for vault lookups but minimal CPU. The key variable is whether your Crypto Express HSMs are already provisioned — if so, encryption's marginal MIPS cost is negligible.
Yes. Tokenization satisfies PCI DSS 4.0 and goes further — it can remove systems from PCI scope entirely. When sensitive cardholder data is replaced with tokens and the token vault is properly segmented from the cardholder data environment, the systems handling only tokens are no longer in scope for PCI assessment. This reduces your audit cost, the number of controls you must maintain, and the compliance personnel required. QSA verification of segmentation is mandatory.
DataStealth provides agentless, format-preserving tokenization for IBM Z environments: