The variant is stripped.
The genome is not.
Genomics consortiums, research institutes, and the partners pitching you consortium access all want the same thing: VCF and BAM files that are genuinely safe to share. The problem is that genomic data is inherently re-identifiable. Removing the sample ID reduces risk - it does not eliminate it. GenomeIQ™ applies population-level k-threshold enforcement, rare-variant suppression, and family-link de-linking before any file moves outside your sequencing environment.
What GenomeIQ™ replaces.
Everything you are currently doing to satisfy your HREC - and hoping is enough.
Three things that change immediately.
Deployed in your sequencing environment
GenomeIQ Agent installed on-prem, connected to your LIMS or sequencer output directory. No raw genomic data touches external infrastructure. Half-day engagement.
k-threshold locked to your HREC methodology
Your HREC-approved k parameter is configured in GenomeIQ at deployment. Every subsequent export enforces it automatically - no manual review, no researcher override.
Risk manifest on every export
Each file release includes a signed methodology manifest with suppression counts, re-identification risk score, and NHMRC framework reference. Ready for consortium submission or HREC audit.
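To make the idea of a signed manifest concrete, here is a minimal sketch that builds a manifest and signs it with HMAC-SHA256 from the Python standard library. The field names, the `build_manifest` and `verify_manifest` helpers, and the choice of a shared-secret HMAC are all assumptions made for illustration; this is not the GenomeIQ manifest format or its signing scheme.

```python
import hashlib
import hmac
import json


def _canonical_bytes(body):
    # Canonical JSON (sorted keys, no extra whitespace) so the signer and
    # the verifier always hash exactly the same bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def build_manifest(k_threshold, suppressed, total, risk_score, key):
    """Return a manifest dict with an HMAC-SHA256 signature (hypothetical format)."""
    body = {
        "k_threshold": k_threshold,
        "variants_total": total,
        "variants_suppressed": suppressed,
        "risk_score": risk_score,
    }
    signature = hmac.new(key, _canonical_bytes(body), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_manifest(manifest, key):
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(key, _canonical_bytes(manifest["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, manifest["signature"])
```

An auditor holding the same key could call `verify_manifest` to confirm that the suppression counts were not altered after export. A production system would more likely use an asymmetric signature so that verifiers never hold the signing key.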
You cannot anonymise a genome.
You can only control the risk.
Genomic data is intrinsically identifying. Removing metadata - sample ID, DOB, clinic reference - reduces the risk but does not eliminate it. Population-level controls, rare-variant suppression, and lineage tracking are required. GenomeIQ applies all three.
The GenomeIQ™ pipeline.
What GenomeIQ™ covers.
Population-level risk controls
GenomeIQ applies population-level re-identification risk controls to every file. Rare variants - those present in fewer than k individuals in the cohort, singletons included - are suppressed before release. Pedigree relationships are de-linked. The configurable k-threshold aligns to your HREC-approved methodology.
Header & metadata scrub
Sample ID, clinic reference, date fields, and phenotype annotations pseudonymised or stripped.
Rare-variant suppression
Variants present in fewer than k individuals, singletons included, are suppressed. Threshold configurable per HREC methodology.
Policy-gated routing
Research archive and AI vendor access governed per data-use agreement. Enforced at network layer.
Lineage and audit trail
Every processing decision signed and logged. Re-identification risk score included in manifest.
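The suppression rule described above (drop any variant carried by fewer than k individuals) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the GenomeIQ implementation: the `suppress_rare_variants` function, the in-memory dictionary layout, and the variant ID strings are hypothetical, and a real pipeline would read genotypes from VCF records.

```python
def suppress_rare_variants(genotypes, k):
    """Split variants into kept and suppressed by carrier count.

    genotypes: {variant_id: {sample_id: alt_allele_count}} (hypothetical layout)
    k: minimum number of carriers a variant needs in order to be released
    """
    kept = {}
    suppressed = []
    for variant_id, calls in genotypes.items():
        # A carrier is any sample with at least one alternate allele.
        carriers = sum(1 for alt_count in calls.values() if alt_count > 0)
        if carriers < k:
            suppressed.append(variant_id)  # too rare to release
        else:
            kept[variant_id] = calls
    return kept, suppressed


cohort = {
    "chr1:1000:A>G": {"S1": 1, "S2": 0, "S3": 0},  # one carrier (a singleton)
    "chr2:500:C>T": {"S1": 1, "S2": 2, "S3": 1},   # three carriers
}
kept, suppressed = suppress_rare_variants(cohort, k=2)
```

With k = 2, the singleton on chr1 lands in `suppressed` and only the chr2 variant is kept. The length of `suppressed` is the kind of figure a suppression count in a manifest would report.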
Designed for reference-grade genomic pipelines.
Things genomics teams usually ask us.
Does GenomeIQ satisfy HREC de-identification requirements?
For most ethics applications, yes. GenomeIQ produces a structured risk manifest documenting your k-threshold, suppression counts, and methodology alignment to the NHMRC Genomic Data Framework. Most HRECs accept this in place of a narrative de-identification description. We recommend reviewing the manifest format with your HREC coordinator before submission.
Why do genomic exports need population-level controls?
Genomic data is uniquely re-identifiable. A small number of variants present in only one or two individuals in a cohort can narrow identification to that group - even without a name or date of birth. GenomeIQ applies population-level controls before release: rare variants carried by fewer than the configured number of individuals (k) are suppressed, sample identifiers are pseudonymised, and family-link metadata is removed. Standard DICOM or HL7 de-identification does not address this.
How does GenomeIQ handle data shared with genomics consortiums?
Each consortium receives a policy-governed export. Your data-use agreement with the consortium is encoded as a GenomeIQ policy - specifying cohort criteria, permitted use, and k-threshold. The export is delivered with a signed manifest the consortium can reference for their own governance obligations. Raw files never leave your sequencing environment.
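One way to picture a data-use agreement encoded as a policy is a small immutable record plus a check that runs before each export. The sketch below is an assumption-laden illustration: `ExportPolicy`, `check_export`, and the use of a minimum cohort size to stand in for "cohort criteria" are hypothetical and do not describe GenomeIQ's policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportPolicy:
    """Hypothetical encoding of one data-use agreement."""
    consortium: str
    permitted_uses: frozenset
    k_threshold: int
    min_cohort_size: int  # assumed stand-in for "cohort criteria"


def check_export(policy, purpose, applied_k, cohort_size):
    """Return a list of violations; an empty list means the export may proceed."""
    violations = []
    if purpose not in policy.permitted_uses:
        violations.append(f"purpose '{purpose}' is not permitted")
    if applied_k < policy.k_threshold:
        violations.append(
            f"k={applied_k} is below the agreed threshold {policy.k_threshold}"
        )
    if cohort_size < policy.min_cohort_size:
        violations.append(
            f"cohort of {cohort_size} is below the minimum {policy.min_cohort_size}"
        )
    return violations
```

Freezing the dataclass mirrors the idea that a policy is fixed at deployment and cannot be changed by the code that requests an export.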
We have a family study - relationships across individuals are essential. Does GenomeIQ handle this?
Yes. GenomeIQ separates family-link suppression from family-link destruction. The exported dataset has family relationship identifiers removed. A pseudonymous link map is retained under controlled conditions - accessible to authorised researchers for re-link under a separate data-access agreement. Your collaborators get what they need; the raw family graph stays under your governance.
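The split between the exported dataset and the retained link map can be sketched as follows. This is an illustrative assumption rather than GenomeIQ's mechanism: `delink_families`, the record layout, and the use of random tokens from Python's `secrets` module are hypothetical.

```python
import secrets


def delink_families(samples):
    """Replace identifiers with random pseudonyms and keep the mapping separately.

    samples: list of dicts with 'sample_id' and 'family_id' (hypothetical layout)
    Returns (export_rows, link_map). Only export_rows is meant to leave the
    environment; link_map stays under controlled access.
    """
    export_rows = []
    link_map = {}
    for sample in samples:
        pseudonym = secrets.token_hex(8)
        while pseudonym in link_map:  # extremely unlikely, but keep pseudonyms unique
            pseudonym = secrets.token_hex(8)
        link_map[pseudonym] = {
            "sample_id": sample["sample_id"],
            "family_id": sample["family_id"],
        }
        # The export row carries no sample ID and no family ID.
        export_rows.append({"pseudonym": pseudonym})
    return export_rows, link_map
```

Because the pseudonyms are random rather than derived from the original IDs, relationships can be recovered only by someone who holds the link map.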
How do we quantify re-identification risk for our ethics submission?
GenomeIQ produces a re-identification risk score for every export, based on cohort size, variant suppression rate, and k-threshold relative to population reference panels. The score is included in the signed manifest alongside the methodology reference. Most ethics committees treat this as sufficient quantitative evidence of risk control.
What k-threshold should we set?
Australian guidance recommends conservative population thresholds for open or controlled-access datasets, with restricted-access thresholds for datasets shared with named collaborators under a formal data-access agreement. GenomeIQ's thresholds are configured during deployment based on your HREC-approved methodology - locked to that document and not adjustable by individual researchers.