Help

GRIMMARD performs HLA imputation: given an individual's HLA typing (complete or partial, high- or low-resolution) together with the population(s) they belong to, it returns the most probable full genotypes, the haplotype pairs that compose them, and the individual haplotypes, each with an associated probability. It combines py-ard (which normalizes the many ways HLA typing can be written into a single standard form) with the graph-based imputation engine ML-GRIM (the imputation component of GRIMM-II).

This page explains every field on the Home form, how to write the typing input, and how to read the results. New users can jump straight to the worked cases on the Example page.

1. What you provide

The Home form has three parts, filled top to bottom:

  1. Output loci — which loci you want the imputation to return.
  2. Population (race) — the population frequencies used to score solutions.
  3. Typing — the observed HLA alleles, entered as a GL‑string or in the per‑allele boxes.

Then press Submit form.

2. Output loci

GRIMMARD supports up to nine loci. Tick the loci you want included in the imputed result:

Group          Loci
HLA class I    A, B, C
HLA class II   DRB1, DQA1, DQB1, DRB3/4/5, DPA1, DPB1

3. Population (race)

HLA alleles do not occur independently; they travel together on haplotypes whose frequencies differ markedly between populations. Imputation therefore needs a population so it can assign probabilities. Choosing a population that matches the individual is the single biggest factor in getting accurate, well-ranked results.

GRIMMARD uses the NMDP / Be The Match operational categories: five broad populations, each subdividing into more specific detailed populations. Use a detailed population when you know it; otherwise fall back to the broad one.

Broad                              Detailed code   Description
AFA (African American)             AFA             African American (broad)
                                   AAFA            African American
                                   AFB             African
                                   CARB            Black Caribbean
                                   SCAMB           Black, South or Central American
API (Asian or Pacific Islander)    API             Asian or Pacific Islander (broad)
                                   AINDI           South Asian Indian
                                   FILII           Filipino
                                   HAWI            Hawaiian or other Pacific Islander
                                   JAPI            Japanese
                                   KORI            Korean
                                   NCHI            Chinese
                                   SCSEAI          Southeast Asian
                                   VIET            Vietnamese
CAU (Caucasian)                    CAU             Caucasian (broad)
                                   EURCAU          White European
                                   MENAFC          Middle Eastern or North Coast of Africa
HIS (Hispanic)                     HIS             Hispanic (broad)
                                   CARHIS          Hispanic Caribbean
                                   MSWHIS          Mexican or Chicano
                                   SCAHIS          Hispanic, South or Central American
NAM (Native American)              CARIBI          Caribbean Indian
                                   AMIND           North American Indian
                                   AISC            American Indian, South or Central American
                                   ALANAM          Alaska Native or Aleut
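
For scripted use, the broad/detailed relationship in the table above can be held in a simple lookup. The following is a minimal sketch, not part of GRIMMARD: the dictionary is copied from the table, and the helper name `broad_population` is hypothetical. It returns the broad category to fall back to when a detailed code is too specific.

```python
# Broad category -> detailed codes, copied from the population table above.
# The table lists no separate "(broad)" row for NAM; it is kept here only
# as a grouping key.
POPULATIONS = {
    "AFA": ["AAFA", "AFB", "CARB", "SCAMB"],
    "API": ["AINDI", "FILII", "HAWI", "JAPI", "KORI", "NCHI", "SCSEAI", "VIET"],
    "CAU": ["EURCAU", "MENAFC"],
    "HIS": ["CARHIS", "MSWHIS", "SCAHIS"],
    "NAM": ["CARIBI", "AMIND", "AISC", "ALANAM"],
}


def broad_population(code: str) -> str:
    """Return the broad category for a broad or detailed code (hypothetical helper)."""
    code = code.strip().upper()
    if code in POPULATIONS:
        return code
    for broad, detailed in POPULATIONS.items():
        if code in detailed:
            return broad
    raise ValueError(f"unknown population code: {code!r}")


print(broad_population("JAPI"))    # API
print(broad_population("MENAFC"))  # CAU
```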
Tips on choosing a population

4. Entering the typing

You can enter the observed alleles in two equivalent ways.

4a. GL‑string (recommended)

A Genotype List string (GL‑string) is the standard text format for HLA typing (Milius et al., 2013). It is built from allele names and a small set of operators:

Operator   Meaning                                                                   Example
* and :    Separate the locus, allele family and protein fields of an allele name    A*02:01
+          Joins the two alleles at one locus (the two chromosomes)                  A*02:01+A*24:02
^          Separates one locus from the next                                         A*02:01+A*24:02^B*40:01+B*57:01
/          Allelic ambiguity: the allele is one of several possibilities             A*02:01/A*02:02

A typical well-formed five-locus GL‑string looks like this:

A*02:01+A*24:02^C*03:03+C*07:01^B*40:01+B*57:01^DRB1*04:01+DRB1*07:01^DQB1*03:02+DQB1*03:03
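
To make the operator precedence concrete, here is a minimal Python sketch that splits a GL-string into loci, allele copies and ambiguity options. It is illustrative only and not GRIMMARD's or py-ard's parser: the function name `parse_gl_string` is hypothetical, and it handles only the `^`, `+` and `/` operators described above (the GL-string format also defines `|` and `~`, which this sketch ignores).

```python
def parse_gl_string(gl: str) -> dict[str, list[list[str]]]:
    """Split a GL-string into {locus: [options for copy 1, options for copy 2]}."""
    genotype = {}
    for locus_part in gl.strip().split("^"):            # one block per locus
        copies = [copy.split("/") for copy in locus_part.split("+")]
        locus = copies[0][0].split("*")[0]              # text before '*' names the locus
        genotype[locus] = copies
    return genotype


example = "A*02:01+A*24:02^B*40:01/B*40:02+B*57:01"
for locus, copies in parse_gl_string(example).items():
    print(locus, copies)
# A [['A*02:01'], ['A*24:02']]
# B [['B*40:01', 'B*40:02'], ['B*57:01']]
```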

GRIMMARD also accepts lower-resolution and ambiguous input, for example G/P groups and multiple allele codes (MAC).

py-ard normalizes all of these to a common form before imputation, so you can paste typing exactly as it arrives from the lab.

4b. Per‑allele boxes

If you prefer not to write a GL‑string, fill the allele boxes on the Home form. They are laid out two rows per locus group, one row per chromosome:

Row                               Boxes
Class I, chromosome 1             A1 B1 C1
Class I, chromosome 2             A2 B2 C2
DR / DQ, chromosome 1             DRB1 DQB1 DRB3 DRB4 DRB5
DQ / DRB3-4-5, chromosome 2       DRB1 DQB1 DRB3 DRB4 DRB5
DP / DQA1, chromosome 1           DPB1 DPA1 DQA1
DP / DQA1, chromosome 2           DPB1 DPA1 DQA1

Leave a box empty for any allele you have not typed. The two methods are interchangeable; use whichever is more convenient.
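
Because the two methods are interchangeable, filled boxes can be thought of as a GL-string written out cell by cell. The sketch below shows that correspondence under several assumptions that the form itself does not confirm: box keys are named `LOCUS_1` and `LOCUS_2` (hypothetical names), each box holds a full allele name such as `A*02:01`, and a locus with only one filled box is emitted as a single allele. DRB3/4/5 handling is deliberately left out.

```python
def boxes_to_gl_string(boxes: dict[str, str], loci: list[str]) -> str:
    """Join per-allele boxes into a GL-string (illustrative helper, not GRIMMARD code)."""
    parts = []
    for locus in loci:
        # Collect the chromosome-1 and chromosome-2 boxes, skipping empty ones.
        copies = [boxes.get(f"{locus}_{n}", "").strip() for n in (1, 2)]
        copies = [allele for allele in copies if allele]
        if copies:
            parts.append("+".join(copies))   # '+' joins the two copies at a locus
    return "^".join(parts)                   # '^' separates loci


boxes = {
    "A_1": "A*02:01", "A_2": "A*24:02",
    "B_1": "B*40:01", "B_2": "B*57:01",
    "DRB1_1": "DRB1*04:01", "DRB1_2": "",    # second DRB1 copy not typed
}
print(boxes_to_gl_string(boxes, ["A", "B", "C", "DRB1"]))
# A*02:01+A*24:02^B*40:01+B*57:01^DRB1*04:01
```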

5. Partial and ambiguous typing

6. Null alleles and DRB3/4/5

7. How the imputation works (in brief)

ML-GRIM uses a fast two-stage procedure. First a blocking stage runs classical graph-based imputation on the (up to three) most informative typed loci, producing every genotype consistent with those loci. Each candidate is then checked for consistency with the remaining typed loci and the missing loci are filled in from the population haplotype frequencies. This keeps memory and run time low while guaranteeing that no consistent genotype is missed. Typical run time is well under one second.

Probabilities are computed from the population frequencies and normalized so that the genotype probabilities for the individual sum to 1. By default the most probable solutions are returned (the engine considers a large candidate set and reports the top-ranked genotypes).
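
The idea of filling in untyped loci from haplotype frequencies and then normalizing can be illustrated with a toy two-locus example. This is a brute-force sketch, not ML-GRIM's graph-based two-stage procedure. The frequencies are made up for illustration, the function name `impute` is hypothetical, and weighting each pair by the product of its haplotype frequencies (doubled for two different haplotypes, as under Hardy-Weinberg proportions) is an assumption of this sketch.

```python
from itertools import combinations_with_replacement

# Made-up (A, B) haplotype frequencies; NOT real population data.
HAP_FREQ = {
    ("A*01:01", "B*08:01"): 0.06,
    ("A*01:01", "B*57:01"): 0.01,
    ("A*02:01", "B*07:02"): 0.03,
    ("A*02:01", "B*44:02"): 0.02,
}


def impute(typed_a: tuple[str, str]) -> list[tuple[tuple, tuple, float]]:
    """Rank haplotype pairs consistent with a typing at locus A only."""
    observed = sorted(typed_a)
    scored = []
    for h1, h2 in combinations_with_replacement(sorted(HAP_FREQ), 2):
        if sorted([h1[0], h2[0]]) != observed:   # pair must explain both A alleles
            continue
        weight = HAP_FREQ[h1] * HAP_FREQ[h2] * (1 if h1 == h2 else 2)
        scored.append((h1, h2, weight))
    if not scored:
        return []                                # no consistent pair: empty result
    total = sum(weight for _, _, weight in scored)
    ranked = [(h1, h2, weight / total) for h1, h2, weight in scored]
    return sorted(ranked, key=lambda row: -row[2])


for h1, h2, prob in impute(("A*01:01", "A*02:01")):
    print(h1, h2, round(prob, 3))
```

Each returned row carries the imputed B alleles alongside the typed A alleles, and the probabilities over all consistent pairs sum to 1. A typing that no haplotype pair can explain yields an empty list, which corresponds to the "no results" case in Troubleshooting.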

8. Reading the results

After you submit, the results are organized into tabs:

Tab                 What it shows
Genotypes           Ranked list of complete unphased genotypes consistent with your input, each with its probability and frequency. The probabilities are normalized to sum to 1 across the returned genotypes.
Haplotype couples   The phased haplotype pairs (the two chromosomes) that make up the genotypes, each with its probability. One unphased genotype can arise from several haplotype pairings.
Haplotype agents    The probability (frequency) of each individual haplotype, reported separately for each population you selected. This is useful for seeing how strongly each population supports a given haplotype.

A higher probability means a solution is more consistent with the typing under the chosen population frequencies. If the true genotype is rare or the typing is sparse, probability mass spreads over many candidates and the top result may carry only a modest probability — this is expected and reflects genuine ambiguity, not an error.
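
The relationship between phased haplotype pairs and unphased genotypes can be sketched in a few lines of Python. The example drops phase by sorting the two alleles at each locus and then sums the probabilities of pairs that collapse to the same genotype. The input values are made up, and the helper names `unphased` and `genotype_probabilities` are hypothetical; this shows the bookkeeping only and is not GRIMMARD's code.

```python
from collections import defaultdict


def unphased(hap1: tuple[str, ...], hap2: tuple[str, ...]) -> tuple[tuple[str, ...], ...]:
    """Drop phase: keep, for each locus, the sorted pair of alleles."""
    return tuple(tuple(sorted(pair)) for pair in zip(hap1, hap2))


def genotype_probabilities(pairs):
    """Sum haplotype-pair probabilities per unphased genotype, highest first."""
    totals = defaultdict(float)
    for hap1, hap2, prob in pairs:
        totals[unphased(hap1, hap2)] += prob
    return sorted(totals.items(), key=lambda item: -item[1])


# Made-up phased pairs over loci (A, B); the first two differ only in phase.
pairs = [
    (("A*01:01", "B*08:01"), ("A*02:01", "B*07:02"), 0.6),
    (("A*01:01", "B*07:02"), ("A*02:01", "B*08:01"), 0.3),
    (("A*01:01", "B*08:01"), ("A*02:01", "B*44:02"), 0.1),
]
for genotype, prob in genotype_probabilities(pairs):
    print(genotype, round(prob, 3))
```

Here three haplotype pairs reduce to two unphased genotypes, because the first two pairs carry the same alleles at every locus and differ only in which chromosome each allele sits on.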

9. Troubleshooting

Symptom                                        Likely cause / fix
No results / empty output                      The typing may be internally inconsistent, or no haplotype with those alleles exists in the chosen population's frequencies. Re-check the alleles and try the broad population code, or add the population that fits the individual.
An allele is rejected                          Check spelling and format (Locus*field:field, e.g. DRB1*04:01). Very new or non-standard allele names may not be in the reference; try a G/P group or MAC equivalent.
A requested locus is missing from the output   That locus is not covered by the selected population's frequencies, or it was not ticked under output loci.
Top genotype has low probability               Normal for sparse or low-resolution typing. Type more (or more informative) loci, or at higher resolution, to concentrate the probability.
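
Before submitting, a quick local check can catch the most common formatting slips in allele names. The regular expression below is a simplified sketch of the Locus*field:field pattern, not the validation GRIMMARD or py-ard performs. It accepts two to four numeric fields with an optional expression suffix, and it intentionally flags G/P groups and MAC codes, which are valid input but fall outside this simple pattern. The names `ALLELE_RE` and `looks_like_allele` are hypothetical.

```python
import re

# Locus name, '*', then 2-4 colon-separated numeric fields and an optional
# expression suffix (N, L, S, C, A or Q). Simplified; assumptions as noted above.
ALLELE_RE = re.compile(r"^[A-Z]+[0-9]*\*[0-9]{2,}(:[0-9]{2,}){1,3}[NLSCAQ]?$")


def looks_like_allele(name: str) -> bool:
    """Return True if the name matches the simplified Locus*field:field pattern."""
    return ALLELE_RE.match(name.strip()) is not None


for name in ["DRB1*04:01", "A*24:09N", "DRB1-04:01", "A*0201"]:
    print(name, looks_like_allele(name))
# DRB1*04:01 True
# A*24:09N True
# DRB1-04:01 False
# A*0201 False
```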

10. Citation and source code

If you use GRIMMARD in your work, please cite the GRIMM-II paper (Kirshenboim et al.) describing ML-GRIM and ML-GRMA, together with the underlying GRIMM framework (Maiers et al., 2019; Israeli et al.). For questions, contact louzouy@math.biu.ac.il.