The variant is stripped.
The genome is not.
Australian Genomics, Garvan Institute, and the research partners pitching you consortium access all want the same thing: VCF and BAM files that are genuinely safe to share. The problem is that genomic data is inherently re-identifiable. Removing the sample ID reduces risk - it does not eliminate it. GenomeIQ applies population-level k-anonymity enforcement, singleton suppression, and pedigree de-linking before any file moves outside your sequencing environment.
What GenomeIQ replaces.
Everything you are currently doing to satisfy your HREC - and hoping is enough.
Three things that change immediately.
Deployed in your sequencing environment
GenomeIQ Agent installed on-prem, connected to your LIMS or sequencer output directory. No raw genomic data touches external infrastructure. Half-day engagement.
k-threshold locked to your HREC methodology
Your HREC-approved k parameter is configured in GenomeIQ at deployment. Every subsequent export enforces it automatically - no manual review, no researcher override.
Risk manifest on every export
Each file release includes a signed methodology manifest with suppression counts, re-identification risk score, and NHMRC framework reference. Ready for consortium submission or HREC audit.
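To make the idea of a signed manifest concrete, here is a minimal sketch. It is not GenomeIQ's actual manifest format or signing scheme: the field names, the example values, and the choice of HMAC-SHA256 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def sign_manifest(manifest: dict, key: bytes) -> dict:
    """Return a copy of the manifest with an HMAC-SHA256 signature attached."""
    # Canonical JSON (sorted keys, fixed separators) so identical content
    # always produces an identical signature.
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {**manifest, "signature": signature}


def verify_manifest(signed: dict, key: bytes) -> bool:
    """Recompute the signature over every field except the signature itself."""
    body = {name: value for name, value in signed.items() if name != "signature"}
    expected = sign_manifest(body, key)["signature"]
    return hmac.compare_digest(expected, signed.get("signature", ""))


# Hypothetical manifest fields and values, for illustration only.
manifest = {
    "k_threshold": 20,
    "variants_total": 1000,
    "variants_suppressed": 137,
    "risk_score": 0.02,
    "framework_reference": "NHMRC",
}
signed = sign_manifest(manifest, key=b"demo-key-not-for-production")
```

A reviewer holding the same key can call `verify_manifest(signed, key)` to confirm that the suppression counts and k-threshold were not edited after export. A production system would more likely use an asymmetric signature so that verifiers never hold the signing key.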
You cannot anonymise a genome.
You can only control the risk.
Genomic data is intrinsically identifying. Removing metadata - sample ID, DOB, clinic reference - reduces the risk but does not eliminate it. Population-level k-anonymity controls, singleton suppression, and lineage tracking are required. GenomeIQ applies all three.
The GenomeIQ pipeline.
What GenomeIQ covers.
k-anonymity risk controls
GenomeIQ applies population-level re-identification risk controls to every file. Rare variants, including singletons, that are present in fewer than k individuals in the cohort are suppressed before release. Pedigree relationships are de-linked. The configurable k-threshold aligns to your HREC-approved methodology.
Header & metadata scrub
Sample ID, clinic reference, date fields, and phenotype annotations pseudonymised or stripped.
Singleton suppression
Variants present in fewer than k individuals suppressed. Threshold configurable per HREC methodology.
Policy-gated routing
Research archive and AI vendor access governed per data-use agreement. Enforced at network layer.
Lineage and audit trail
Every processing decision signed and logged. Re-identification risk score included in manifest.
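The header scrub listed above can be illustrated with a short sketch. It assumes a standard VCF column header line (nine fixed columns followed by one column per sample) and replaces each sample ID with a keyed pseudonym. The function names, the `S_` prefix, and the use of HMAC-SHA256 are hypothetical choices for this example, not a description of GenomeIQ internals.

```python
import hashlib
import hmac

# A VCF column header has nine fixed columns before the per-sample columns:
# #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
VCF_FIXED_COLUMNS = 9


def pseudonym(sample_id: str, key: bytes) -> str:
    """Derive a stable pseudonym from a sample ID using a secret key."""
    digest = hmac.new(key, sample_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "S_" + digest[:12]


def scrub_header_line(line: str, key: bytes) -> str:
    """Replace sample IDs in a '#CHROM' header line; return other lines unchanged."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "#CHROM":
        return line.rstrip("\n")
    fixed = fields[:VCF_FIXED_COLUMNS]
    samples = fields[VCF_FIXED_COLUMNS:]
    return "\t".join(fixed + [pseudonym(sample, key) for sample in samples])
```

Using a keyed hash rather than a plain hash means that someone who knows the sample ID format cannot regenerate the pseudonyms without the key. The key itself would need to stay inside the sequencing environment.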
Designed for reference-grade genomic pipelines.
Things genomics teams usually ask us.
Does GenomeIQ satisfy HREC de-identification requirements?
For most ethics applications, yes. GenomeIQ produces a structured risk manifest documenting your k-threshold, suppression counts, and methodology alignment with the NHMRC Genomic Data Framework. Most HRECs accept this in place of a narrative de-identification description. We recommend reviewing the manifest format with your HREC coordinator before submission.
What is k-anonymity and why does it matter for genomic data?
k-anonymity means that any individual's record is indistinguishable from at least k−1 others in the dataset. For genomic data, this is applied at the variant level: a variant present in fewer than k individuals in your cohort can be used to narrow identification to a small group - even without a name or date of birth. GenomeIQ suppresses those variants before the file is released. Standard DICOM or HL7 de-identification does not address this.
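The variant-level rule described above can be sketched in a few lines. This is an illustrative simplification, not GenomeIQ's implementation: it assumes genotypes are given as VCF-style strings such as `0/1`, counts an individual as a carrier if they have at least one non-reference allele, and uses made-up variant identifiers.

```python
from typing import Dict, List, Tuple


def carrier_count(genotypes: List[str]) -> int:
    """Count individuals carrying at least one non-reference allele."""
    count = 0
    for genotype in genotypes:
        # Treat phased ("|") and unphased ("/") genotypes the same way.
        alleles = genotype.replace("|", "/").split("/")
        # "0" is the reference allele and "." is a missing call.
        if any(allele not in ("0", ".") for allele in alleles):
            count += 1
    return count


def suppress_rare_variants(
    variants: Dict[str, List[str]], k: int
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split variants into those kept and those suppressed (fewer than k carriers)."""
    kept: Dict[str, List[str]] = {}
    suppressed: List[str] = []
    for variant_id, genotypes in variants.items():
        if carrier_count(genotypes) < k:
            suppressed.append(variant_id)
        else:
            kept[variant_id] = genotypes
    return kept, suppressed


# Hypothetical four-person cohort, for illustration only.
cohort = {
    "chr1:1000:A>G": ["0/1", "1/1", "0/1", "0/0"],  # 3 carriers
    "chr2:2000:C>T": ["0/0", "0/1", "0/0", "0/0"],  # 1 carrier (a singleton)
    "chr3:3000:G>A": ["0|1", "0|0", "1|0", "./."],  # 2 carriers
}
kept, suppressed = suppress_rare_variants(cohort, k=2)
```

With `k=2`, the singleton on chr2 is suppressed and the other two variants are kept. Raising k to 3 would also suppress the chr3 variant, which shows how the threshold trades data utility against re-identification risk.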
How does GenomeIQ handle data shared with Australian Genomics or similar consortiums?
Each consortium receives a policy-governed export. Your data-use agreement with the consortium is encoded as a GenomeIQ policy - specifying cohort criteria, permitted use, and k-threshold. The export is delivered with a signed manifest the consortium can reference for their own governance obligations. Raw files never leave your sequencing environment.
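One way to picture a data-use agreement encoded as a policy is the sketch below. The class names, fields, and the use of a minimum cohort size as the "cohort criteria" are assumptions for illustration only. The actual GenomeIQ policy schema is not described here.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ExportPolicy:
    """Hypothetical encoding of one data-use agreement."""
    recipient: str
    permitted_uses: FrozenSet[str]
    min_cohort_size: int  # stand-in for richer cohort criteria
    k_threshold: int


@dataclass(frozen=True)
class ExportRequest:
    """Hypothetical description of one proposed export."""
    recipient: str
    intended_use: str
    cohort_size: int
    applied_k: int


def is_permitted(policy: ExportPolicy, request: ExportRequest) -> bool:
    """Allow an export only if every condition of the agreement is met."""
    return (
        request.recipient == policy.recipient
        and request.intended_use in policy.permitted_uses
        and request.cohort_size >= policy.min_cohort_size
        and request.applied_k >= policy.k_threshold
    )


policy = ExportPolicy(
    recipient="example-consortium",
    permitted_uses=frozenset({"rare-disease-research"}),
    min_cohort_size=500,
    k_threshold=20,
)
request = ExportRequest("example-consortium", "rare-disease-research", 800, 20)
```

Here `is_permitted(policy, request)` returns `True`. Changing the intended use to something outside the agreement, or applying a k below the agreed threshold, makes the check fail before any file is produced.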
We have a family study - pedigree relationships are essential to our research. Does GenomeIQ handle this?
Yes. GenomeIQ separates pedigree de-linking from pedigree destruction. The exported dataset has family relationship identifiers removed. A pseudonymous pedigree map is retained under controlled conditions - accessible to authorised researchers for re-linking under a separate data-access agreement. Your collaborators get what they need; the raw family graph stays under your governance.
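The split between what is exported and what is retained can be sketched as follows. The example assumes pedigree rows in the common PED layout (family ID, individual ID, father ID, mother ID, with `0` for an unknown parent). The function name, pseudonym prefixes, and random-token scheme are hypothetical and do not describe GenomeIQ internals.

```python
import secrets
from typing import Dict, List, NamedTuple, Tuple


class PedRow(NamedTuple):
    family_id: str
    individual_id: str
    father_id: str  # "0" means unknown, following the PED convention
    mother_id: str


def delink(rows: List[PedRow]) -> Tuple[List[str], Dict[str, PedRow]]:
    """Return (exported pseudonyms without relationships, retained pedigree map)."""
    # Random pseudonyms, so nothing about the original IDs can be derived.
    person = {row.individual_id: "P_" + secrets.token_hex(6) for row in rows}
    family = {fid: "F_" + secrets.token_hex(6) for fid in {row.family_id for row in rows}}

    # Exported view: pseudonyms only, sorted so input order does not hint at families.
    exported = sorted(person.values())

    # Retained view: the same relationships, expressed purely in pseudonyms.
    pedigree_map = {
        person[row.individual_id]: PedRow(
            family[row.family_id],
            person[row.individual_id],
            person.get(row.father_id, "0"),
            person.get(row.mother_id, "0"),
        )
        for row in rows
    }
    return exported, pedigree_map


# Hypothetical trio, for illustration only.
trio = [
    PedRow("FAM1", "dad", "0", "0"),
    PedRow("FAM1", "mum", "0", "0"),
    PedRow("FAM1", "kid", "dad", "mum"),
]
exported, pedigree_map = delink(trio)
```

The `exported` list is what travels with the released dataset. The `pedigree_map` stays behind under access control, and re-linking is only possible for someone who holds it.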
How do we quantify re-identification risk for our ethics submission?
GenomeIQ produces a re-identification risk score for every export, based on cohort size, variant suppression rate, and k-threshold relative to population reference panels. The score is included in the signed manifest alongside the methodology reference. Most ethics committees treat this as sufficient quantitative evidence of risk control.
What k-threshold should we set?
The NHMRC Genomic Data Framework recommends k=100 as a default for open or controlled-access datasets. For restricted-access datasets shared with named collaborators under a formal data-access agreement, k=20 or k=50 may be appropriate. We configure your k-threshold during deployment based on your HREC-approved methodology - it is locked to that document and not adjustable by individual researchers.