
De-identification IG catalogue.

The three Library manifests from io.cognovis.de-identification.de@0.11.0 that drive AnonymizingTransport — PII fields, free-text scrubbing patterns, and quasi-identifier k-floor thresholds. All tables are machine-generated from the IG source at build time.

package: io.cognovis.de-identification.de
version: 0.11.0

Library-pii-fields-manifest

PII_FIELDS — structured direct identifiers.

Direct identifiers: fields that unambiguously identify an individual without aggregation. These fields are deleted from FHIR resources by filterPiiFromResource(), which is called by AnonymizingTransport as transformation 2. Array sub-paths like dosageInstruction[].text remove only the named sub-field, preserving the rest of the array element.

// Source file: packages/fhir-de/src/client/generated/pii-fields.ts
// Source IG:   io.cognovis.de-identification.de@0.11.0

import { PII_FIELDS, filterPiiFromResource } from '@polaris/sdk/fhir'

| Resource type | Removed field paths | Fields |
| --- | --- | --- |
| Account | name, owner | 2 |
| CarePlan | author | 1 |
| ChargeItem | enterer, performingOrganization | 2 |
| Coverage | identifier, payor | 2 |
| DocumentReference | content[].attachment.url | 1 |
| ExplanationOfBenefit | disposition | 1 |
| Location | address, description, name, position | 4 |
| MedicationDispense | dosageInstruction[].text | 1 |
| MedicationRequest | dosageInstruction[].text | 1 |
| Patient | address, identifier, name, telecom | 4 |
| Practitioner | address, birthDate, name, telecom | 4 |
| Provenance | agent | 1 |
| Specimen | processing[].description | 1 |

Using filterPiiFromResource directly

import { filterPiiFromResource, PII_FIELDS } from '@polaris/sdk/fhir'

// Filter a single resource without AnonymizingTransport
const cleaned = filterPiiFromResource(
  { resourceType: 'Patient', id: 'p-1', name: [{ family: 'Mustermann' }], gender: 'male' },
  'Patient'
)
// cleaned.name === undefined
// cleaned.gender === 'male'  (preserved)

// Inspect the catalogue
console.log(PII_FIELDS.Patient)
// ['address', 'identifier', 'name', 'telecom']

Array sub-path semantics

// 'dosageInstruction[].text' means:
// for each element in dosageInstruction[], delete .text
// but preserve all other sub-fields

// Before:
dosageInstruction: [{
  text: 'private dosage note',   // ← removed
  route: { coding: [...] },      // ← preserved
  timing: { repeat: {...} },     // ← preserved
  doseAndRate: [...],             // ← preserved
}]

// After:
dosageInstruction: [{
  route: { coding: [...] },
  timing: { repeat: {...} },
  doseAndRate: [...],
}]
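
The array sub-path rule can be sketched as a small recursive walk. This is a minimal illustration under stated assumptions, not the SDK's implementation: `removePath` and `removeFieldPath` are hypothetical helpers, and the path syntax is assumed to be dot-separated segments where a trailing `[]` means "for each array element".

```typescript
// Hypothetical helpers for illustration; not part of @polaris/sdk.
type Json = Record<string, unknown>

function removePath(node: unknown, segments: string[]): void {
  if (node === null || typeof node !== 'object' || segments.length === 0) return
  const head = segments[0] as string
  const rest = segments.slice(1)
  const isArraySegment = head.endsWith('[]')
  const key = isArraySegment ? head.slice(0, -2) : head
  const obj = node as Json

  if (rest.length === 0) {
    // Last segment: delete the named field itself.
    delete obj[key]
    return
  }

  const child = obj[key]
  if (isArraySegment) {
    // 'field[]' fans out over every element of the array.
    if (Array.isArray(child)) {
      for (const element of child) removePath(element, rest)
    }
  } else {
    removePath(child, rest)
  }
}

function removeFieldPath<T extends Json>(resource: T, path: string): T {
  // Deep-copy via JSON so the input resource is left untouched.
  const copy = JSON.parse(JSON.stringify(resource)) as T
  removePath(copy, path.split('.'))
  return copy
}

const before = {
  resourceType: 'MedicationRequest',
  dosageInstruction: [
    { text: 'private dosage note', patientInstruction: 'take with food' },
    { text: 'second note' },
  ],
}
const after = removeFieldPath(before, 'dosageInstruction[].text')
console.log(JSON.stringify(after.dosageInstruction))
// [{"patientInstruction":"take with food"},{}]
```

The same walk handles nested paths such as content[].attachment.url, since each segment is resolved independently.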

Library-scrub-patterns

FREE_TEXT_FIELDS + FREE_TEXT_PATTERNS — narrative scrubbing.

Two exports from a single generated file: field paths that contain clinical narrative text, and the regex patterns used to scrub PII tokens from those fields. AnonymizingTransport applies all four patterns to every field in the FREE_TEXT_FIELDS catalogue (transformation 5).

// Source file: packages/fhir-de/src/client/generated/free-text-patterns.ts
// Source IG:   io.cognovis.de-identification.de@0.11.0

import { FREE_TEXT_FIELDS, FREE_TEXT_PATTERNS } from '@polaris/sdk/fhir'

FREE_TEXT_FIELDS

| Resource type | Scrubbed field path |
| --- | --- |
| AllergyIntolerance | note[].text |
| CarePlan | note[].text |
| ChargeItem | note[].text |
| DiagnosticReport | conclusion |
| DocumentReference | description |
| MedicationAdministration | dosage.text |
| Observation | valueString |

FREE_TEXT_PATTERNS

| ID | Replacement | Example match |
| --- | --- | --- |
| de-titled-name | [NAME] | "Dr. Mustermann", "Hr. Schmidt", "Fr. Weber" |
| de-date | [DATE] | "15.03.1985", "07.04.2025" |
| de-german-phone | [TEL] | "+49 30 1234567", "030 9876543" |
| de-kvnr | [KV-NR] | "A123456789" |

Scope limitation

Bare name bigrams (Firstname Lastname without a title) are excluded to avoid false positives on clinical terms like "Diabetes Mellitus" or "Akute Otitis Media". Only titled names (Dr., Hr., Fr.) are caught by de-titled-name.
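
The trade-off can be illustrated with two regexes. Both are hypothetical and written only for this sketch; the actual patterns live in the generated free-text-patterns.ts and may differ. The titled pattern leaves the diagnosis intact but misses an untitled name, while a naive bigram pattern catches the name and destroys the diagnosis.

```typescript
// Hypothetical patterns for illustration only; not the generated ones.
const titledName = /\b(?:Dr|Hr|Fr)\.\s+[A-ZÄÖÜ][a-zäöüß]+/g
const bareBigram = /\b[A-ZÄÖÜ][a-zäöüß]+\s+[A-ZÄÖÜ][a-zäöüß]+/g

const note = 'Dr. Mustermann diagnostiziert Diabetes Mellitus bei Max Weber.'

// Titled names only: the clinical term survives, the untitled name does too.
console.log(note.replace(titledName, '[NAME]'))
// '[NAME] diagnostiziert Diabetes Mellitus bei Max Weber.'

// Bare bigrams: the untitled name is caught, but so is the diagnosis.
console.log(note.replace(bareBigram, '[NAME]'))
// 'Dr. Mustermann diagnostiziert [NAME] bei [NAME].'
```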

Using FREE_TEXT_PATTERNS directly

import { FREE_TEXT_PATTERNS } from '@polaris/sdk/fhir'

// Each pattern is a { id, pattern, replacement } object
for (const { id, pattern, replacement } of FREE_TEXT_PATTERNS) {
  console.log(id, pattern.toString())
}

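// Apply all patterns to a string.
// Note: String.prototype.replace rewrites every match only when the regex
// has the global (g) flag; this assumes the generated patterns carry it.
function scrub(text: string): string {
  let result = text
  for (const { pattern, replacement } of FREE_TEXT_PATTERNS) {
    result = result.replace(pattern, replacement)
  }
  return result
}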

scrub('Patient Dr. Mustermann, born 15.03.1985, KV: A123456789')
// → 'Patient [NAME], born [DATE], KV: [KV-NR]'

Library-quasi-id-k-floors-manifest

QUASI_ID_K_FLOORS — k-anonymity thresholds.

Quasi-identifiers are fields that cannot identify an individual in isolation but can do so in combination (birthDate, postal code, occupation). The k-floor is the minimum group size required before such a field can be released in an analytics context. AnonymizingTransport does not enforce k-anonymity directly — these thresholds are used by analytics pipelines (aggregation, reporting) that consume anonymized FHIR data.

// Source file: packages/fhir-de/src/client/generated/quasi-id-k-floors.ts
// Source IG:   io.cognovis.de-identification.de@0.11.0

import {
  QUASI_ID_K_FLOORS,
  DEFAULT_QUASI_ID_K_FLOOR,
  getQuasiIdKFloor,
} from '@polaris/fhir-de'  // re-exported from generated file

| Resource type | Field | k-floor | Notes |
| --- | --- | --- | --- |
| Condition | icd-3char | 5 | ICD-10 code at 3-character level (e.g. J00); requires at least 5 patients per group |
| Condition | rare-icd-categories | 11 | Rare disease ICD categories; higher floor due to higher re-identification risk |
| Patient | birthDate | 11 | Full birth date; high re-identification risk, stricter floor |
| Patient | occupation | 5 | Patient occupation code |
| Patient | plz | 5 | German postal code (Postleitzahl) |
| DEFAULT | all other fields | 5 | DEFAULT_QUASI_ID_K_FLOOR; fallback for fields not explicitly listed |

Using getQuasiIdKFloor

import {
  getQuasiIdKFloor,
  DEFAULT_QUASI_ID_K_FLOOR,
  QUASI_ID_K_FLOORS,
} from '@polaris/fhir-de'  // note: from @polaris/fhir-de, not @polaris/sdk/fhir

// Get the k-floor for a specific field:
getQuasiIdKFloor('Patient', 'birthDate')     // → 11
getQuasiIdKFloor('Patient', 'plz')           // → 5
getQuasiIdKFloor('Condition', 'icd-3char')   // → 5
getQuasiIdKFloor('Unknown', 'someField')     // → 5 (DEFAULT_QUASI_ID_K_FLOOR)

// Check if a cohort is large enough before releasing a field:
function isKAnonymous(
  resourceType: string,
  fieldPath: string,
  cohortSize: number
): boolean {
  return cohortSize >= getQuasiIdKFloor(resourceType, fieldPath)
}

isKAnonymous('Patient', 'birthDate', 10)  // → false (k=11 required)
isKAnonymous('Patient', 'birthDate', 11)  // → true
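
In an analytics pipeline, the same check is typically applied per group rather than per cohort. The following is a minimal sketch of small-group suppression, assuming a simple count-by-value aggregation. `kFloor` and `releasableCounts` are hypothetical local stand-ins so the example runs on its own; the floor values are copied from the table above, and real code would call getQuasiIdKFloor instead.

```typescript
// Local stand-ins; real code would import getQuasiIdKFloor from '@polaris/fhir-de'.
const K_FLOORS: Record<string, number> = {
  'Patient.birthDate': 11,
  'Patient.plz': 5,
}
const DEFAULT_K_FLOOR = 5

function kFloor(resourceType: string, field: string): number {
  return K_FLOORS[`${resourceType}.${field}`] ?? DEFAULT_K_FLOOR
}

// Hypothetical helper: count records per value and keep only the groups
// whose size reaches the k-floor for that field.
function releasableCounts(
  resourceType: string,
  field: string,
  values: string[]
): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const value of values) {
    counts[value] = (counts[value] ?? 0) + 1
  }
  const k = kFloor(resourceType, field)
  const released: Record<string, number> = {}
  for (const [value, size] of Object.entries(counts)) {
    if (size >= k) released[value] = size
  }
  return released
}

// Six patients in one postal code, four in another.
const plzValues = [
  ...Array<string>(6).fill('10115'),
  ...Array<string>(4).fill('80331'),
]
const released = releasableCounts('Patient', 'plz', plzValues)
console.log(JSON.stringify(released))
// {"10115":6}  ('80331' has only 4 patients, below the floor of 5)
```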

IG version

Version tracing and regeneration.

All three generated files carry a machine-generated header with the source IG version. The TypeScript constants DE_IDENTIFICATION_IG_VERSION and DE_IDENTIFICATION_IG_PACKAGE in pii-fields.ts are the canonical source of truth for the bundled version.

Current bundled version

Package: io.cognovis.de-identification.de
Version: 0.11.0

Regenerating the catalogue

# Regenerate all three generated files from the IG:
bun run --cwd packages/fhir-de generate:deidentification

# The generated files carry a DO NOT EDIT header:
# Source: io.cognovis.de-identification.de@0.11.0
# Regenerate: bun run --cwd packages/fhir-de generate:deidentification

IG version pinning with requireIgVersion

Import DE_IDENTIFICATION_IG_VERSION and pass it as requireIgVersion to AnonymizingTransport if you want startup-time validation that the bundled IG version matches what your code expects. See the AnonymizingTransport docs for details.
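
As a rough picture of what startup-time validation means here, the sketch below compares a required version against the bundled one and fails fast on a mismatch. It is an assumption-laden illustration: `assertIgVersion` is a hypothetical function, exact string equality is an assumed comparison rule, and the constant is redeclared locally instead of being imported. How AnonymizingTransport actually performs the check is described in its own docs.

```typescript
// Stand-in for the constant exported by the generated pii-fields.ts.
const DE_IDENTIFICATION_IG_VERSION: string = '0.11.0'

// Hypothetical check; the real validation is internal to AnonymizingTransport.
function assertIgVersion(required: string, bundled: string): void {
  if (required !== bundled) {
    throw new Error(
      `De-identification IG version mismatch: required ${required}, bundled ${bundled}`
    )
  }
}

// Matching versions pass silently.
assertIgVersion('0.11.0', DE_IDENTIFICATION_IG_VERSION)

// A mismatch surfaces at startup instead of during a later request.
try {
  assertIgVersion('0.12.0', DE_IDENTIFICATION_IG_VERSION)
} catch (error) {
  console.log((error as Error).message)
}
```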