WO2015026967A1


Info

Publication number: WO2015026967A1
Application number: PCT/US2014/051926
Authority: WO
Inventors: Allison Ryan; Matthew Rabinowitz; Sallie MCADOO; Styrmir Sigurjonsson
Original assignee: Natera, Inc.
Priority date: 2013-08-20
Filing date: 2014-08-20
Publication date: 2015-02-26

Abstract

One embodiment is a method of evaluating a fetus for genetic abnormalities (or the risk of genetic abnormalities), comprising, obtaining a blood sample from a pregnant human subject, measuring the fetal fraction of cell free fetal DNA in the blood sample, determining if the fetal fraction of cell free fetal DNA in the blood sample is within the lowest 1 percentile of a reference population, and generating a report indicating that an invasive genetic analysis procedure should be performed. In some embodiments the reference population is matched for gestational age or maternal weight or both gestational age and maternal weight. In some embodiments the test population is matched for gestational age or maternal weight or both gestational age and maternal weight. Genetic abnormalities that can be detected include aneuploidy.

Description

Methods of Using Low Fetal Fraction Detection

Field

This invention is in the field of prenatal diagnostics.

Background

A human being normally has two sets of 23 chromosomes in every somatic cell, with one copy coming from each parent. Aneuploidy, a state where a cell has the wrong number of chromosomes, is responsible for a significant percentage of children born with genetic conditions. Detection of chromosomal abnormalities can identify individuals, including fetuses or embryos, with conditions such as Down syndrome, Edwards syndrome, Klinefelter syndrome, and Turner syndrome, among others. Since chromosomal abnormalities are generally undesirable, the detection of such a chromosomal abnormality in a fetus may provide the basis for the decision to terminate a pregnancy.

Prenatal diagnosis can alert physicians and parents to abnormalities in growing fetuses. Some currently available methods, such as amniocentesis and chorionic villus sampling (CVS), are able to diagnose genetic defects with high accuracy; however, they may carry a risk of spontaneous abortion. Other methods can indirectly estimate a risk of certain genetic defects non-invasively, for example from hormone levels in maternal blood and/or from ultrasound data; however, their accuracies are much lower. It has recently been discovered that cell-free fetal DNA and intact fetal cells can enter maternal blood circulation. This provides an opportunity to directly measure genetic information about a fetus, specifically the aneuploidy state of the fetus, in a manner which is non-invasive, for example from a maternal blood draw.

Methods of detecting genetic abnormalities in a fetus based on the analysis of cell free DNA in maternal blood can involve the step of determining the relative amounts of cell free fetal DNA and cell free maternal DNA, i.e., the fetal fraction, in the maternal blood for analysis.

Summary

One embodiment is a method of evaluating a fetus for genetic abnormalities (or the risk of genetic abnormalities), comprising, obtaining a blood sample from a pregnant human subject, measuring the fetal fraction of cell free fetal DNA in the blood sample, determining if the fetal fraction of cell free fetal DNA in the blood sample is within the lowest 1 percentile of a reference population, and generating a report indicating that an invasive genetic analysis procedure should be performed.

Another embodiment is a method of evaluating a fetus for genetic abnormalities (or the risk of genetic abnormalities), comprising,

obtaining a blood sample from a pregnant human subject, measuring the fetal fraction of the cell free fetal DNA in the blood sample, determining if the ratio of cell free fetal DNA in the blood sample to maternal cell free DNA in the blood sample is in the lowest 1 percentile of a reference population, and performing an invasive genetic test if the fetal fraction is in the lowest 1 percentile of a reference population.

Another embodiment is a method of evaluating a fetus for genetic abnormalities (or the risk of genetic abnormalities), comprising, obtaining a plurality of blood samples from a test population of pregnant human subjects, measuring the fetal fraction of the cell free fetal DNA in the blood samples, determining a population subset consisting of members of the test population that have a fetal fraction that is in the lowest 1 percentile of the test population, and performing an invasive genetic test on at least one member of the population subset.

Another embodiment is a method of evaluating a fetus for genetic abnormalities (or the risk of genetic abnormalities), comprising, obtaining a plurality of blood samples from a test population of pregnant human subjects, measuring the fetal fraction of the cell free fetal DNA relative to maternal cell free DNA in the blood samples, determining a population subset consisting of members of the test population that have a fetal fraction that is in the lowest 1 percentile of the test population, and performing a test for a detached placenta on at least one member of the population subset.

In some embodiments the reference population is matched for gestational age or maternal weight or both gestational age and maternal weight. In some embodiments the test population is matched for gestational age or maternal weight or both gestational age and maternal weight. Genetic abnormalities that can be detected include aneuploidy, such as trisomies.

Description

Applicants have discovered that a low fetal fraction is indicative of genetic abnormalities in the fetus. Applicants have also discovered that a low fetal fraction is indicative of a detached placenta.

In some embodiments, rather than employing a reference population, the fetal fraction is measured in the blood sample obtained from the patient (i.e., the pregnant woman being tested) and subsequently compared against the larger test population in which fetal fraction is measured, so as to determine the percentile of the test population in which the patient's fetal fraction lies. It will be understood by the person of ordinary skill in the art that determining how often a fetal fraction value lower than that measured in the patient is found in the test population gives the same result as determining how often a higher fetal fraction value is found in the test population. For example, if a fetal fraction is found to be 6% in the patient, it may be that 99% of the test population has a higher fetal fraction and 1% of the test population has a lower fetal fraction. In order to avoid ambiguity, the term percentile will be used to indicate where in the test population the observed fetal fraction measurement lies. Thus, if a measured fetal fraction is found to be exceeded by 99% of the test population, the fetal fraction is said to be in the lowest 1% of the test population. The term "fetal fraction" as used herein refers to the quantity of cell free fetal DNA relative to the total of cell free maternal DNA plus cell free fetal DNA. Fetal fraction is expressed herein in terms of percent.
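The percentile convention described above can be illustrated with a minimal Python sketch. The function names and the synthetic population values are assumptions made for illustration only; they are not part of the disclosure.

```python
def percent_exceeding(patient_ff, population_ffs):
    """Percent of the population whose fetal fraction exceeds the patient's."""
    higher = sum(1 for ff in population_ffs if ff > patient_ff)
    return 100.0 * higher / len(population_ffs)


def in_lowest_percentile(patient_ff, population_ffs, cutoff_percent=1.0):
    """True if the patient's value is exceeded by at least (100 - cutoff)% of the population."""
    return percent_exceeding(patient_ff, population_ffs) >= 100.0 - cutoff_percent


# Synthetic population of 200 fetal fractions (in percent), 2.0 through 21.9.
population = [k / 10 for k in range(20, 220)]

low_patient = in_lowest_percentile(2.05, population)      # exceeded by 199 of 200
typical_patient = in_lowest_percentile(10.0, population)  # exceeded by 119 of 200
```

Here a value exceeded by 99.5% of the synthetic population is flagged as being in the lowest 1%, while a mid-range value is not.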

The subject methods may be used with a variety of reference populations.

The reference population is composed of individuals who are as closely matched as possible with respect to the following parameters: gestational age and maternal weight. Such parameters are known to affect the level of fetal fraction. In general, fetal fraction increases with gestational age. Similarly, fetal fraction decreases with increasing weight of the mother. In some embodiments, the reference population will be matched with the test patient for gestational age of the fetus. In some embodiments, the reference population will be matched for the weight of the mother. In some embodiments, the reference population will be matched with the test patient for gestational age of the fetus and the weight of the mother. In some embodiments, the reference population will be further matched for parameters such as maternal age, ethnicity, and the like.

It will be appreciated that, in order to match population parameters, some embodiments bin the parameter into ranges rather than relying on a perfectly matched population. Thus, for example, a patient carrying a 10 week fetus may be compared with a reference population of women carrying fetuses that are 10-12 weeks old, or 9-11 weeks old, or 10-15 weeks old. Similarly, for example, a patient weighing 70 kilograms may be compared with a reference population of women that weigh 65-75 kilograms, or 70-80 kilograms, or 65-80 kilograms.
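A minimal sketch of such binning is shown below. The record layout, the field names, the default windows (10-12 weeks and 65-75 kilograms for a 10 week, 70 kilogram patient) and the sample values are all hypothetical choices made for illustration.

```python
def matched_reference(records, gest_age_weeks, weight_kg,
                      age_window=(0, 2), weight_window=(-5, 5)):
    """Return fetal fractions of reference records that fall in the patient's bins."""
    lo_age = gest_age_weeks + age_window[0]
    hi_age = gest_age_weeks + age_window[1]
    lo_kg = weight_kg + weight_window[0]
    hi_kg = weight_kg + weight_window[1]
    return [r["fetal_fraction"] for r in records
            if lo_age <= r["gest_age_weeks"] <= hi_age
            and lo_kg <= r["weight_kg"] <= hi_kg]


# Made-up reference records, for illustration only.
records = [
    {"gest_age_weeks": 10, "weight_kg": 68, "fetal_fraction": 9.5},
    {"gest_age_weeks": 12, "weight_kg": 74, "fetal_fraction": 11.0},
    {"gest_age_weeks": 14, "weight_kg": 70, "fetal_fraction": 13.2},
    {"gest_age_weeks": 11, "weight_kg": 90, "fetal_fraction": 6.1},
]

matched = matched_reference(records, gest_age_weeks=10, weight_kg=70)
```

Widening either window trades closeness of match for a larger reference population.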

The size of the reference population may vary substantially. In accordance with basic principles of statistics, the larger the reference population, the more meaningful the indication of the condition that is being detected, i.e., genetic abnormalities or detached placenta. The reference population should have at least 25 members, more preferably at least 50 members, more preferably at least 75 members, and even more preferably at least 100 members.

The subject methods may employ any of a variety of methods of measuring fetal fraction and are not limited to a specific method. Examples of such methods of determining fetal fraction include using a high throughput DNA sequencer to count alleles at a large number of polymorphic genetic loci and modeling the likely fetal fraction (see for example US patent application 13/300,235; PCT application PCT/11/61506; US patent application 13/110,685; US published patent application 2013/0165203A1; US published patent application 2012/0264121 A1). A method of calculating fetal fraction can be found in Sparks et al. American Journal of Obstetrics and Gynecology 319.e1 (April 2012). Fetal fraction may be determined using a methylation assay (see US patents 7,754,428 B2; 7,901,884 B2; 8,166,382 B2) that assumes certain loci are methylated or preferentially methylated in the fetus, and those same loci are unmethylated or preferentially unmethylated in the mother.

Although it is possible to utilize a wide range of fetal fraction measurement techniques when determining where in the frequency range a given fetal fraction measurement from a test subject lies, it is preferable (although not necessary) that the fetal fraction data used to create the reference population is obtained by the same method as the method used to measure fetal fraction in the patient. Nonetheless, in some embodiments the method of determining fetal fraction used for the reference population is different from the method of determining fetal fraction used for the individual patient.

In those embodiments of the invention employing a test population, fetal fraction is determined in essentially the same manner for all constituents of the test population. The test population comprises a set of patients that are being co-analyzed in some way, e.g., same laboratory, same protocol, same study, and the like, and thus does not require comparison to a set of a priori generated fetal fraction data. In general, the larger the test population, the more meaningful the results. A test population preferably comprises at least 100 members. The subject methods may be used with a variety of reference populations.

The test population is composed of individuals who are as closely matched as possible with respect to the following parameters: gestational age and maternal weight. Such parameters are known to affect the level of fetal fraction. In general, fetal fraction increases with gestational age. Similarly, fetal fraction decreases with increasing weight of the mother. In some embodiments, the test population will be matched for gestational age of the fetus. In some embodiments, the test population will be matched for the weight of the mother. In some embodiments, the test population will be matched for gestational age of the fetus and the weight of the mother. In some embodiments, the test population will be further matched for parameters such as maternal age, ethnicity, and the like.

It will be appreciated that in order to have a test population of sufficient size it may be necessary to expand the range of the parameter of interest rather than relying on a perfectly matched test population. Thus, for example, rather than having a test population of subjects having fetuses with a gestational age of 10 weeks, a test population may consist of subjects with fetuses that are 10-12 weeks old, or 9-11 weeks old, or 10-15 weeks old. Similarly, a test population may consist of women that weigh 65-75 kilograms, or 70-80 kilograms, or 65-80 kilograms.

The subject methods may be used to test for increased risk of a wide variety of genetic abnormalities that produce low fetal fraction. Such abnormalities include aneuploidy, deletions, translocations, insertions, and point mutations. Exemplary of such genetic abnormalities are trisomy 21 (Down syndrome), trisomy 18 (Edwards syndrome), trisomy 13 (Patau syndrome), 45,X (Turner syndrome), and an unbalanced translocation on chromosome 10.

In one embodiment of the invention, an abnormally low fetal fraction in the maternal plasma of a blood sample from a pregnant woman is predictive of a fetal abnormality, and appropriate follow up measures may be taken, for example, an invasive genetic testing procedure such as chorionic villus biopsy or amniocentesis, thereby providing for the possibility of diagnosing an abnormal karyotype or other abnormality in the fetus. In one embodiment, abnormally low may mean in the lowest half percentile for fetal fraction, the lowest percentile for fetal fraction, the lowest two percentiles for fetal fraction, the lowest three percentiles for fetal fraction, the lowest four percentiles for fetal fraction, or the lowest five percentiles for fetal fraction. In one embodiment, the percentiles may be determined using the fetal fraction distribution for all samples in a data set. In one embodiment, the percentiles may be determined using the fetal fraction distribution for only euploid samples in a data set. In one embodiment, the percentiles may be determined using fetal fraction distributions that are adjusted for maternal weight, gestational age, any other factor correlated with fetal fraction, or a combination thereof. In one embodiment, the fetal abnormality may include trisomy 13, trisomy 18, triploidy, detached placenta, a trisomy, other whole chromosome abnormalities, an unbalanced translocation, a microdeletion of up to 20 Mb, a microduplication of up to 20 Mb, a genetic defect, or a combination thereof. In some embodiments, the fetal abnormality may indicate the presence of a condition that is threatening to the health of the mother, such as pre-eclampsia.

In an embodiment of the present disclosure, this may entail making a measurement of the mixed sample to determine the fraction of fetal DNA in the mixture; this estimation of the fetal fraction may be done with sequencing, it may be done with TaqMan, it may be done with qPCR, it may be done with SNP arrays, it may be done with any method that can distinguish different alleles at a given locus. In one embodiment the estimation of fetal fraction may be done using a methylation assay (see US patents 7,754,428 B2; 7,901,884 B2; 8,166,382 B2) that assumes certain loci are methylated or preferentially methylated in the fetus, and those same loci are unmethylated or preferentially unmethylated in the mother. The method can measure the degree of methylation at those loci and infer the percent fetal DNA that must be in the sample. The need for a fetal fraction estimate may be eliminated by including hypotheses that cover all or a selected set of fetal fractions in the set of hypotheses that are considered when comparing to the actual measured data. After the fraction of fetal DNA in the mixture has been determined, the number of sequences to be read for each sample may be determined.

Parental contexts. SNPs tend to be dimorphic, that is, one of two possible alleles tends to be observed in the population. One may assign the letters A and B to each of the two alleles at a SNP. Each person typically has two copies of each chromosome, and therefore, two copies of each allele. There are examples where this is not true, for example, an individual with Down syndrome has three copies of chromosome 21, and a male will have one copy of each of chromosome X and Y. If an individual has two copies of the A allele, or two copies of the B allele, they are homozygous at that allele, and this may be written as AA or BB. If that individual has one copy of each allele, they are heterozygous at that allele, and this may be written as AB. For the purposes of this discussion, (AA|AB) may be used to refer to the parental context at a given SNP where the mother is AA - homozygous, and the fetus is AB - heterozygous.

In one embodiment, the fetal fraction may be estimated by targeted sequencing, wherein a plurality of single nucleotide polymorphisms are preferentially enriched and/or selectively amplified. The fetal fraction can be most easily estimated by looking at those loci where the mother is homozygous and the fetus is heterozygous, that is, the parental context (AA|AB), which is equivalent by symmetry to the context (BB|AB). One may also use other parental contexts, for example, those loci where the mother is heterozygous and the fetus is homozygous. In a preferred embodiment, those sequencing reads that correspond to SNPs where the mother is homozygous and the fetus is heterozygous are quantified for each allele for each SNP. For the purpose of this discussion one may define fetal fraction X to mean that X% of the DNA in the sample is of fetal origin, and 100% - X% of the DNA in the sample is of maternal origin. At these loci where the parental context is (AA|AB), one may presume that for a fetal fraction of X, about ½ X% of the reads from that SNP will indicate the presence of the B allele, and 100% - ½ X% of the reads will indicate the presence of the A allele. Therefore, if the average fraction of reads that map to the B allele for each SNP from the set of SNPs in the (AA|AB) parental context is F, then the fetal fraction X = 2F. Alternately, or in combination, one could use the parental context (AB|AA) or (AB|BB). In the case of SNPs from the (AB|AA) context, the fraction of reads from the A allele is expected to be 50% + ½ X%, and the fraction of reads from the B allele is expected to be 50% - ½ X%. Thus if, for all SNPs from the (AB|AA) context, the average percent of reads mapping to the A allele is G, then the fetal fraction is about 2 x (G - 50%).
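The two estimators described above, X = 2F for the (AA|AB) context and X = 2 x (G - 50%) for the (AB|AA) context, can be sketched as follows. The function names and the input layout, a list of (A reads, B reads) pairs per SNP, are illustrative assumptions; fractions are returned on a 0 to 1 scale rather than in percent.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def fetal_fraction_from_aa_ab(counts):
    """(AA|AB) context: X = 2F, where F is the mean fraction of B-allele reads."""
    f = _mean(b / (a + b) for a, b in counts)
    return 2.0 * f


def fetal_fraction_from_ab_aa(counts):
    """(AB|AA) context: X = 2 * (G - 0.5), where G is the mean fraction of A-allele reads."""
    g = _mean(a / (a + b) for a, b in counts)
    return 2.0 * (g - 0.5)


# Toy read counts, for illustration only: (A reads, B reads) per SNP.
x_from_hom_mother = fetal_fraction_from_aa_ab([(95, 5), (190, 10)])
x_from_het_mother = fetal_fraction_from_ab_aa([(55, 45), (110, 90)])
```

Both toy inputs correspond to a fetal fraction of about 10%. In practice the per-SNP fractions are noisy, so the estimate improves with the number of SNPs and the read depth.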

Ploidy Calling Informatics Methods

Described herein is a method for determining the ploidy state of a fetus given sequence data. In some embodiments, this sequence data may be measured on a high throughput sequencer. In some embodiments, the sequence data may be measured on DNA that originated from free floating DNA isolated from maternal blood, wherein the free floating DNA comprises some DNA of maternal origin, and some DNA of fetal / placental origin. This section will describe one embodiment of the present disclosure in which the ploidy state of the fetus is determined assuming that the fraction of fetal DNA in the mixture that has been analyzed is not known and will be estimated from the data. It will also describe an embodiment in which the fraction of fetal DNA ("fetal fraction") or the percentage of fetal DNA in the mixture can be measured by another method, and is assumed to be known in determining the ploidy state of the fetus. In some embodiments the fetal fraction can be calculated using only the genotyping measurements made on the maternal blood sample itself, which is a mixture of fetal and maternal DNA. In some embodiments the fraction may be calculated also using the measured or otherwise known genotype of the mother and/or the measured or otherwise known genotype of the father. In another embodiment the ploidy state of the fetus can be determined solely based on the calculated fraction of fetal DNA for the chromosome in question compared to the calculated fraction of fetal DNA for the reference chromosome assumed disomic.

In the preferred embodiment, suppose that, for a particular chromosome, we observe and analyze N SNPs, for which we have:

• Set of NR free floating DNA sequence measurements $S = (s_1, \ldots, s_{NR})$. Since this method utilizes the SNP measurements, all sequence data that corresponds to non-polymorphic loci can be disregarded. In a simplified version, where we have (A,B) counts on each SNP, where A and B correspond to the two alleles present at a given locus, S can be written as $S = ((a_1, b_1), \ldots, (a_N, b_N))$, where $a_i$ is the A count on SNP i, $b_i$ is the B count on SNP i, and

$$\sum_{i=1}^{N} (a_i + b_i) = NR$$

• Parent data consisting of

o genotypes from a SNP microarray or other intensity based genotyping platform: mother $M = (m_1, \ldots, m_N)$, father $F = (f_1, \ldots, f_N)$, where $m_i, f_i \in \{AA, AB, BB\}$.

o AND/OR sequence data measurements: NRM mother measurements $SM = (sm_1, \ldots, sm_{NRM})$, NRF father measurements $SF = (sf_1, \ldots, sf_{NRF})$. Similar to the above simplification, if we have (A,B) counts on each SNP, $SM = ((am_1, bm_1), \ldots, (am_N, bm_N))$, $SF = ((af_1, bf_1), \ldots, (af_N, bf_N))$.

Collectively, the mother, father, and child data are denoted as D = (M,F,SM,SF,S). Note that the parent data is desired and increases the accuracy of the algorithm, but is NOT necessary, especially the father data. This means that even in the absence of mother and/or father data, it is possible to get very accurate copy number results.

It is possible to derive the best copy number estimate (H^*) by maximizing the data log likelihood LIK(D|H) over all hypotheses (H) considered. In particular it is possible to determine the relative probability of each of the ploidy hypotheses using the joint distribution model and the allele counts measured on the prepared sample, and using those relative probabilities to determine the hypothesis most likely to be correct as follows:

$$H^* = \arg\max_{H} \, LIK(D|H)$$

It is also possible to use priors to find the maximum a posteriori estimate:

$$H_{MAP} = \arg\max_{H} \, \left[ LIK(D|H) + \log priorprob(H) \right]$$

where priorprob(H) is the prior probability assigned to each hypothesis H, based on model design and prior knowledge.
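The selection of the most likely hypothesis can be sketched in a few lines. This is a minimal illustration, assuming that LIK(D|H) has already been computed for each hypothesis and that the prior enters as an additive log term (the usual form when LIK is a log likelihood). The function name, the dictionary layout and the numeric values are hypothetical.

```python
import math


def best_hypothesis(log_liks, priors=None):
    """Return the hypothesis maximizing LIK(D|H), or LIK(D|H) + log prior if priors are given."""
    if priors is None:
        return max(log_liks, key=log_liks.get)
    return max(log_liks, key=lambda h: log_liks[h] + math.log(priors[h]))


# Made-up log likelihoods and priors, for illustration only.
log_liks = {"H11": -100.0, "H21": -99.0, "H10": -120.0}
priors = {"H11": 0.98, "H21": 0.01, "H10": 0.01}

h_ml = best_hypothesis(log_liks)           # maximum likelihood choice
h_map = best_hypothesis(log_liks, priors)  # maximum a posteriori choice
```

With these toy numbers the maximum likelihood choice is H21, but the strong prior on disomy moves the maximum a posteriori choice to H11.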

In an embodiment, the copy number hypotheses that may be considered are:

• Monosomy:

o maternal H10 (one copy from mother)

o paternal H01 (one copy from father)

• Disomy: H11 (one copy each from mother and father)

• Simple trisomy, no crossovers considered:

o Maternal: H21_matched (two identical copies from mother, one copy from father), H21_unmatched (both copies from mother, one copy from father)

o Paternal: H12_matched (one copy from mother, two identical copies from father), H12_unmatched (one copy from mother, both copies from father)

• Composite trisomy, allowing for crossovers (using a joint distribution model):

o maternal H21 (two copies from mother, one from father),

o paternal H12 (one copy from mother, two copies from father)

In other embodiments, other ploidy states, such as nullsomy (H00), uniparental disomy (H20 and H02), and tetrasomy (H04, H13, H22, H31 and H40), may be considered.

If there are no crossovers, each trisomy, whether the origin was mitosis, meiosis I, or meiosis II, would be one of the matched or unmatched trisomies. Due to crossovers, true trisomy is usually a combination of the two. First, a method to derive hypothesis likelihoods for simple hypotheses is described. Then a method to derive hypothesis likelihoods for composite hypotheses is described, combining individual SNP likelihood with crossovers.

LIK(D|H) for a Simple Hypothesis

In an embodiment, LIK(D|H) may be determined for simple hypotheses, as follows. For simple hypotheses H, LIK(D|H), the log likelihood of hypothesis H on a whole chromosome, may be calculated as the sum of log likelihoods of individual SNPs, assuming known or derived child fraction cf:

$$LIK(D|H) = \sum_{i} LIK(D|H, i)$$

In an embodiment it is possible to derive cf from the data.

This hypothesis does not assume any linkage between SNPs, and therefore does not utilize a joint distribution model.

In some embodiments, the log likelihood may be determined on a per SNP basis. On a particular SNP i, assuming fetal ploidy hypothesis H and percent fetal DNA cf, the log likelihood of observed data D is defined as:

$$LIK(D|H, i) = \log P(D|H, cf, i) = \log \left( \sum_{m, f, c} P(D|m, f, c, H, cf, i) \, P(c|m, f, H) \, P(m|i) \, P(f|i) \right)$$

where m are possible true mother genotypes, f are possible true father genotypes, where m, f ∈ {AA, AB, BB}, and c are possible child genotypes given the hypothesis H. In particular, for monosomy c ∈ {A, B}, for disomy c ∈ {AA, AB, BB}, for trisomy c ∈ {AAA, AAB, ABB, BBB}.

Genotype prior frequency: p(m|i) is the general prior probability of mother genotype m on SNP i, based on the known population frequency at SNP i, denoted pA_i. In particular

$$p(AA|pA_i) = (pA_i)^2, \quad p(AB|pA_i) = 2 \, pA_i \, (1 - pA_i), \quad p(BB|pA_i) = (1 - pA_i)^2$$

Father genotype probability, p(f|i), may be determined in an analogous fashion.
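A small sketch of the genotype prior follows, assuming the three genotype probabilities take the Hardy-Weinberg form implied by the population frequency of the A allele. The function name is hypothetical.

```python
def genotype_prior(p_a):
    """Prior probability of each parental genotype given A-allele population frequency p_a."""
    return {
        "AA": p_a ** 2,
        "AB": 2.0 * p_a * (1.0 - p_a),
        "BB": (1.0 - p_a) ** 2,
    }


prior_common = genotype_prior(0.5)  # both alleles equally frequent
prior_rare = genotype_prior(0.1)    # A allele is rare
```

The same function can serve for both p(m|i) and p(f|i), since the prior depends only on the allele frequency at SNP i.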

True child probability: p(c|m, f, H) is the probability of getting true child genotype = c, given parents m, f, and assuming hypothesis H, which can be easily calculated. For example, for H11, H21 matched and H21 unmatched, p(c|m,f,H) is obtained by enumerating the alleles that each parent can transmit under the hypothesis.
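One way to compute p(c|m,f,H) for the three hypotheses named above is to enumerate the transmitted alleles directly. The sketch below assumes each parental homolog is transmitted with equal probability and follows the hypothesis definitions given earlier (matched: one maternal homolog duplicated; unmatched: both maternal homologs). The function name and the hypothesis labels used as strings are illustrative.

```python
from collections import Counter
from itertools import product


def child_genotype_probs(m, f, hypothesis):
    """p(c | m, f, H) for H in {'H11', 'H21_matched', 'H21_unmatched'}; m, f in {'AA','AB','BB'}."""
    if hypothesis == "H11":
        maternal = list(m)                  # one of the two maternal alleles
    elif hypothesis == "H21_matched":
        maternal = [x * 2 for x in m]       # one maternal homolog, duplicated
    elif hypothesis == "H21_unmatched":
        maternal = [m]                      # both maternal homologs
    else:
        raise ValueError("unsupported hypothesis: " + hypothesis)
    weight = 1.0 / (len(maternal) * len(f))
    probs = Counter()
    for mat, pat in product(maternal, f):   # the father contributes one allele
        probs["".join(sorted(mat + pat))] += weight
    return dict(probs)


disomy = child_genotype_probs("AB", "AB", "H11")
matched = child_genotype_probs("AB", "AA", "H21_matched")
unmatched = child_genotype_probs("AB", "AA", "H21_unmatched")
```

For a heterozygous mother and an AA father, the matched trisomy yields AAA or ABB with equal probability, while the unmatched trisomy always yields AAB. This difference is what allows the two hypotheses to be distinguished from allele counts.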

Data likelihood: P(D|m, f, c, H, i, cf) is the probability of given data D on SNP i, given true mother genotype m, true father genotype f, true child genotype c, hypothesis H and child fraction cf. It can be broken down into the probability of mother, father and child data as follows:

$$P(D|m, f, c, H, cf, i) = P(SM|m, i) \, P(M|m, i) \, P(SF|f, i) \, P(F|f, i) \, P(S|m, c, H, cf, i)$$

Mother SNP array data likelihood: The probability of mother SNP array genotype data m_i at SNP i compared to true genotype m, assuming SNP array genotypes are correct, is simply

$$P(M|m, i) = \begin{cases} 1 & m_i = m \\ 0 & m_i \neq m \end{cases}$$

Mother sequence data likelihood: the probability of the mother sequence data at SNP i, in the case of counts (am_i, bm_i) with no extra noise or bias involved, is the binomial probability defined as P(SM|m,i) = P_{X|m}(am_i), where X|m ~ Binom(p_m(A), am_i + bm_i), with p_m(A) defined as the fraction of A alleles in genotype m (1 for AA, 0.5 for AB, 0 for BB).

Father data likelihood: a similar equation applies for father data likelihood.

Note that it is possible to determine the child genotype without the parent data, especially father data. For example if no father genotype data F is available, one may just use P(F|f, i) = 1. If no father sequence data SF is available, one may just use P(SF|f, i) = 1.

In some embodiments, the method involves building a joint distribution model for the expected allele counts at a plurality of polymorphic loci on the chromosome for each ploidy hypothesis; one method to accomplish such an end is described here.

Free fetal DNA data likelihood: P(S|m, c, H, cf, i) is the probability of free fetal DNA sequence data on SNP i, given true mother genotype m, true child genotype c, child copy number hypothesis H, and assuming child fraction cf. It is in fact the probability of sequence data S on SNP i, given the true probability of A content on SNP i, μ(m, c, cf, H):

$$P(S|m, c, H, cf, i) = P(S|\mu(m, c, cf, H), i)$$

For counts, where S_i = (a_i, b_i), with no extra noise or bias in data involved,

$$P(S|\mu(m, c, cf, H), i) = P_X(a_i)$$

where X ~ Binom(p(A), a_i + b_i) with p(A) = μ(m, c, cf, H). In a more complex case where the exact alignment and (A,B) counts per SNP are not known, P(S|μ(m, c, cf, H), i) is a combination of integrated binomials.

True A content probability: μ(m, c, cf, H), the true probability of A content on SNP i in this mother/child mixture, assuming that true mother genotype = m, true child genotype = c, and overall child fraction = cf, is defined as

$$\mu(m, c, cf, H) = \frac{\#_A(m) \, (1 - cf) + \#_A(c) \, cf}{n_m \, (1 - cf) + n_c \, cf}$$

where #A(g) = number of A's in genotype g, n_m = 2 is somy of mother and n_c is ploidy of the child under hypothesis H (1 for monosomy, 2 for disomy, 3 for trisomy).
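The true A content and the binomial count likelihood can be sketched together. In this illustration genotypes are plain strings such as "AA" or "AAB", so the somy is simply the string length; this representation and the function names are assumptions, and the simple noise-free binomial case is the only one covered.

```python
from math import comb, log


def true_a_content(m, c, cf):
    """mu(m, c, cf, H): expected fraction of A reads in the mother/child mixture."""
    numerator = m.count("A") * (1.0 - cf) + c.count("A") * cf
    denominator = len(m) * (1.0 - cf) + len(c) * cf  # n_m and n_c are the genotype lengths
    return numerator / denominator


def log_binom_pmf(a, b, p):
    """log P_X(a) for X ~ Binom(p, a + b), with the degenerate p = 0 and p = 1 cases handled."""
    if p <= 0.0:
        return 0.0 if a == 0 else float("-inf")
    if p >= 1.0:
        return 0.0 if b == 0 else float("-inf")
    return log(comb(a + b, a)) + a * log(p) + b * log(1.0 - p)


mu_disomy = true_a_content("AA", "AB", 0.1)    # mother AA, disomic child AB
mu_trisomy = true_a_content("AA", "AAB", 0.1)  # mother AA, trisomic child AAB
loglik = log_binom_pmf(950, 50, mu_disomy)     # toy counts (a_i, b_i) = (950, 50)
```

The child ploidy enters only through the length of the child genotype string, which is how the hypothesis H affects the expected A content.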

Using A Joint Distribution Model: LIK(D|H) for a Composite Hypothesis

In some embodiments, the method involves building a joint distribution model for the expected allele counts at the plurality of polymorphic loci on the chromosome for each ploidy hypothesis; one method to accomplish such an end is described here. In many cases, trisomy is usually not purely matched or unmatched, due to crossovers, so in this section results for composite hypotheses H21 (maternal trisomy) and H12 (paternal trisomy) are derived, which combine matched and unmatched trisomy, accounting for possible crossovers.

In the case of trisomy, if there were no crossovers, trisomy would be simply matched or unmatched trisomy. Matched trisomy is where child inherits two copies of the identical chromosome segment from one parent. Unmatched trisomy is where child inherits one copy of each homologous chromosome segment from the parent. Due to crossovers, some segments of a chromosome may have matched trisomy, and other parts may have unmatched trisomy. Described in this section is how to build a joint distribution model for the heterozygosity rates for a set of alleles; that is, for the expected allele counts at a number of loci for one or more hypotheses.

Suppose that on SNP i, LIK(D|H_m, i) is the fit for matched hypothesis H_m, and LIK(D|H_u, i) is the fit for unmatched hypothesis H_u, and pc(i) = probability of crossover between SNPs i-1 and i. One may then calculate the full likelihood as:

$$LIK(D|H) = \sum_{E} LIK(D|E, 1:N)$$

where LIK(D|E, 1:N) is the likelihood of ending in hypothesis E, for SNPs 1:N. E = hypothesis of the last SNP, E ∈ {H_m, H_u}. Recursively, one may calculate:

$$LIK(D|E, 1:i) = LIK(D|E, i) + \log\left( \exp(LIK(D|E, 1:i-1)) \, (1 - pc(i)) + \exp(LIK(D|\sim E, 1:i-1)) \, pc(i) \right)$$

where ~E is the hypothesis other than E (not E), where hypotheses considered are H_m and H_u. In particular, one may calculate the likelihood of SNPs 1:i, based on the likelihood of SNPs 1 to (i-1) with either the same hypothesis and no crossover, or the opposite hypothesis and a crossover, multiplied by the likelihood of SNP i.

For SNP 1, i=1, LIK(D|E, 1:1) = LIK(D|E, 1).

For SNP 2, i=2, LIK(D|E, 1:2) = LIK(D|E, 2) + log(exp(LIK(D|E, 1)) * (1 - pc(2)) + exp(LIK(D|~E, 1)) * pc(2)),

and so on for i=3:N.
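The recursion over SNPs can be sketched as below. This is an illustrative implementation, not the disclosed one: the function names are hypothetical, the sums of exponentials are evaluated with a log-sum-exp helper to avoid underflow (an implementation choice not stated in the text), and the function returns the two ending log likelihoods LIK(D|H_m, 1:N) and LIK(D|H_u, 1:N) without combining them.

```python
import math

NEG_INF = float("-inf")


def _safe_log(p):
    return math.log(p) if p > 0.0 else NEG_INF


def _log_add_exp(x, y):
    """log(exp(x) + exp(y)) computed stably."""
    hi, lo = max(x, y), min(x, y)
    if hi == NEG_INF:
        return NEG_INF
    return hi + math.log1p(math.exp(lo - hi))


def ending_log_likelihoods(lik_m, lik_u, pc):
    """lik_m[i], lik_u[i]: per-SNP log likelihoods under H_m and H_u.
    pc[i]: crossover probability between SNP i-1 and SNP i (pc[0] is unused)."""
    end_m, end_u = lik_m[0], lik_u[0]
    for i in range(1, len(lik_m)):
        stay, switch = _safe_log(1.0 - pc[i]), _safe_log(pc[i])
        new_m = lik_m[i] + _log_add_exp(end_m + stay, end_u + switch)
        new_u = lik_u[i] + _log_add_exp(end_u + stay, end_m + switch)
        end_m, end_u = new_m, new_u
    return end_m, end_u


# Toy per-SNP log likelihoods for two SNPs, for illustration only.
lik_m = [math.log(0.5), math.log(0.5)]
lik_u = [math.log(0.25), math.log(0.25)]
end_m, end_u = ending_log_likelihoods(lik_m, lik_u, pc=[0.0, 0.1])
```

The loop runs in time linear in the number of SNPs, since each step depends only on the two values carried over from the previous SNP.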

In some embodiments, the child fraction may be determined. The child fraction may refer to the proportion of sequences in a mixture of DNA that originate from the child. In the context of non-invasive prenatal diagnosis, the child fraction may refer to the proportion of sequences in the maternal plasma that originate from the fetus or the portion of the placenta with fetal genotype. It may refer to the child fraction in a sample of DNA that has been prepared from the maternal plasma, and may be enriched in fetal DNA. One purpose of determining the child fraction in a sample of DNA is for use in an algorithm that can make ploidy calls on the fetus, therefore, the child fraction could refer to whatever sample of DNA was analyzed by sequencing for the purpose of non-invasive prenatal diagnosis.

Some of the algorithms presented in this disclosure that are part of a method of noninvasive prenatal aneuploidy diagnosis assume a known child fraction, which may not always be the case. In an embodiment, it is possible to find the most likely child fraction by maximizing the likelihood for disomy on selected chromosomes, with or without the presence of the parental data.

In particular, suppose that LIK(D|H11, cf, chr) = log likelihood as described above, for the disomy hypothesis, and for child fraction cf on chromosome chr. For selected chromosomes in Cset (usually 1:16), assumed to be euploid, the full likelihood is:

$$LIK(cf) = \sum_{chr \in Cset} LIK(D|H11, cf, chr)$$

The most likely child fraction (cf^*) is derived as

$$cf^* = \arg\max_{cf} \, LIK(cf)$$
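A grid search is one simple way to carry out this maximization. The sketch below is an assumption-laden illustration: the disomy log likelihood is passed in as a callable, and the toy stand-in used here models only (AA|AB)-context reads, where the expected B-allele fraction is cf/2. The names, the grid and the counts are hypothetical.

```python
import math


def most_likely_child_fraction(disomy_log_lik, chromosomes, grid):
    """Return the cf on the grid that maximizes the summed disomy log likelihood."""
    def total(cf):
        return sum(disomy_log_lik(cf, chrom) for chrom in chromosomes)
    return max(grid, key=total)


# Toy stand-in for LIK(D|H11, cf, chr): pooled (A reads, B reads) per chromosome
# at SNPs where the mother is AA and the child is AB.
toy_counts = {1: (950, 50), 2: (1900, 100)}


def toy_disomy_log_lik(cf, chrom):
    a, b = toy_counts[chrom]
    p_b = cf / 2.0
    return a * math.log(1.0 - p_b) + b * math.log(p_b)


grid = [k / 100 for k in range(1, 51)]  # candidate child fractions 0.01 .. 0.50
cf_star = most_likely_child_fraction(toy_disomy_log_lik, [1, 2], grid)
```

With 5% B-allele reads on both toy chromosomes, the search returns a child fraction of 0.10. A finer grid or a one-dimensional optimizer could replace the coarse grid.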

It is possible to use any set of chromosomes. It is also possible to derive child fraction without assuming euploidy on the reference chromosomes. Using this method it is possible to determine the child fraction for any of the following situations: (1) one has array data on the parents and shotgun sequencing data on the maternal plasma; (2) one has array data on the parents and targeted sequencing data on the maternal plasma; (3) one has targeted sequencing data on both the parents and maternal plasma; (4) one has targeted sequencing data on both the mother and the maternal plasma fraction; (5) one has targeted sequencing data on the maternal plasma fraction; (6) other combinations of parental and child fraction measurements.

In some embodiments the informatics method may incorporate data dropouts; this may result in ploidy determinations of higher accuracy. Elsewhere in this disclosure it has been assumed that the probability of getting an A is a direct function of the true mother genotype, the true child genotype, the fraction of the child in the mixture, and the child copy number. It is also possible that mother or child alleles can drop out; for example, instead of measuring true child AB in the mixture, it may be the case that only sequences mapping to allele A are measured. One may denote the parent dropout rate for genomic Illumina data d_pg, the parent dropout rate for sequence data d_ps, and the child dropout rate for sequence data d_cs. In some embodiments, the mother dropout rate may be assumed to be zero, and child dropout rates are relatively low; in this case, the results are not severely affected by dropouts. In some embodiments the possibility of allele dropouts may be sufficiently large that they result in a significant effect on the predicted ploidy call. For such a case, allele dropouts have been incorporated into the algorithm here:

Parent SNP array data dropouts: For mother genomic data M, suppose that the genotype after the dropout is m_d, then

P(M|m, i) = \sum_{m_d} P(M|m_d, i) P(m_d|m)

where P(M|m_d, i) is defined as before, and P(m_d|m) is the likelihood of genotype m_d after the possible dropout given the true genotype m, defined as below, for dropout rate d (rows: true genotype m; columns: genotype m_d after dropout):

m \ m_d   AA        AB        BB        A         B         nocall
AA        (1-d)^2   0         0         2d(1-d)   0         d^2
AB        0         (1-d)^2   0         d(1-d)    d(1-d)    d^2
BB        0         0         (1-d)^2   0         2d(1-d)   d^2

A similar equation applies for father SNP array data.

Parent sequence data dropouts: For mother sequence data SM

P(SM|m, i) = \sum_{m_d} P_{X|m_d}(am_i) P(m_d|m)

where P(m_d|m) is defined as in the previous section and P_{X|m_d}(am_i), a probability from a binomial distribution, is defined as before in the parent data likelihood section. A similar equation applies to the paternal sequence data.

Free floating DNA sequence data dropout:

P(S|m, c, H, cf, i) = \sum_{m_d, c_d} P(S|\mu(m_d, c_d, cf, H), i) P(m_d|m) P(c_d|c)

where P(S|\mu(m_d, c_d, cf, H), i) is as defined in the section on free floating data likelihood.

In an embodiment, P(m_d|m) is the probability of observed mother genotype m_d, given true mother genotype m, assuming dropout rate d_ps, and P(c_d|c) is the probability of observed child genotype c_d, given true child genotype c, assuming dropout rate d_cs. If n_AT = number of A alleles in true genotype c, n_AD = number of A alleles in observed genotype c_d, where n_AT ≥ n_AD, and similarly n_BT = number of B alleles in true genotype c, n_BD = number of B alleles in observed genotype c_d, where n_BT ≥ n_BD, and d = dropout rate, then

P(c_d|c) = \binom{n_AT}{n_AD} * d^{n_AT - n_AD} * (1 - d)^{n_AD} * \binom{n_BT}{n_BD} * d^{n_BT - n_BD} * (1 - d)^{n_BD}
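The dropout probability can be sketched as a product of two binomial terms, one for the A alleles and one for the B alleles, assuming each allele copy drops out independently with rate d. The function name `dropout_prob` and the string encoding of genotypes (with the empty string standing for a no-call) are illustrative assumptions.

```python
from math import comb

def dropout_prob(true_gt, obs_gt, d):
    """P(obs_gt | true_gt) when each allele copy drops out independently with rate d.

    Genotypes are strings of 'A' and 'B', e.g. "AB" or "AAB"; "" is a no-call.
    """
    n_at, n_bt = true_gt.count("A"), true_gt.count("B")
    n_ad, n_bd = obs_gt.count("A"), obs_gt.count("B")
    if n_ad > n_at or n_bd > n_bt:
        return 0.0  # dropout can remove alleles but cannot create them
    return (comb(n_at, n_ad) * d ** (n_at - n_ad) * (1 - d) ** n_ad
            * comb(n_bt, n_bd) * d ** (n_bt - n_bd) * (1 - d) ** n_bd)

d = 0.1
observed = ["AA", "AB", "BB", "A", "B", ""]
# Probabilities of each observed genotype given a true AB genotype.
row_ab = [dropout_prob("AB", g, d) for g in observed]
```

Under this independence assumption the same function covers disomic parent genotypes and child genotypes with other copy numbers, and the probabilities over all observable genotypes sum to one.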

In an embodiment, the informatics method may incorporate random and consistent bias. In an ideal world there is no per-SNP consistent sampling bias or random noise (in addition to the binomial distribution variation) in the number of sequence counts. In particular, on SNP i, for mother genotype m, true child genotype c and child fraction cf, and X = the number of A's in the set of (A+B) reads on SNP i, X acts like X ~ Binomial(p, A+B), where p = \mu(m, c, cf, H) = true probability of A content.

In an embodiment, the informatics method may incorporate random bias. As is often the case, suppose that there is a bias in the measurements, so that the probability of getting an A on this SNP is equal to q, which is slightly different from p as defined above. How much q differs from p depends on the accuracy of the measurement process and a number of other factors, and can be quantified by the standard deviation of q away from p. In an embodiment, it is possible to model q as having a beta distribution, with parameters \alpha, \beta chosen so that the mean of that distribution is centered at p, with some specified standard deviation s. In particular, this gives X|q ~ Bin(q, D_i), where q ~ Beta(\alpha, \beta). If we let E(q) = p and V(q) = s^2, the parameters \alpha, \beta can be derived as \alpha = pN, \beta = (1 - p)N, where N = p(1 - p)/s^2 - 1.

This is the definition of a beta-binomial distribution, where one is sampling from a binomial distribution with variable parameter q, where q follows a beta distribution with mean p. So, in a setup with no bias, on SNP i, the parent sequence data (SM) probability assuming true mother genotype (m), given mother sequence A count on SNP i (am_i) and mother sequence B count on SNP i (bm_i), may be calculated as:

P(SM|m, i) = P_{X|m}(am_i), where X|m ~ Binom(p_m(A), am_i + bm_i)

Now, including random bias with standard deviation s, this becomes:

X|m ~ BetaBinom(p_m(A), am_i + bm_i, s)

In the case with no bias, the maternal plasma DNA sequence data (S) probability assuming true mother genotype (m), true child genotype (c), child fraction (cf), assuming child hypothesis H, given free floating DNA sequence A count on SNP i (a_i) and free floating sequence B count on SNP i (b_i), may be calculated as

P(S|m, c, cf, H, i) = P_X(a_i)

where X ~ Binom(p(A), a_i + b_i) with p(A) = \mu(m, c, cf, H).

In an embodiment, including random bias with standard deviation s, this becomes X ~ BetaBinom(p(A), a_i + b_i, s), where the amount of extra variation is specified by the deviation parameter s, or equivalently N. The smaller the value of s (or the larger the value of N), the closer this distribution is to the regular binomial distribution. It is possible to estimate the amount of bias, i.e. estimate N above, from unambiguous contexts AA|AA, BB|BB, AA|BB, BB|AA and use the estimated N in the above probability. Depending on the behavior of the data, N may be made to be a constant irrespective of the depth of read a_i + b_i, or a function of a_i + b_i, making bias smaller for larger depths of read.
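The beta-binomial probability parameterized by a mean p and a standard deviation s can be sketched with the standard library alone. It assumes the moment matching of a beta distribution with mean p and variance s^2, which gives N = p(1 - p)/s^2 - 1, and it requires 0 < p < 1. The function names are hypothetical.

```python
import math

def beta_params(p, s):
    """Beta(alpha, beta) parameters with mean p and standard deviation s."""
    n = p * (1.0 - p) / (s * s) - 1.0
    if n <= 0:
        raise ValueError("s is too large for a beta distribution with mean p")
    return p * n, (1.0 - p) * n

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_pmf(k, depth, p, s):
    """P(X = k) for a beta-binomial with mean fraction p, `depth` reads, spread s."""
    a, b = beta_params(p, s)
    log_comb = math.lgamma(depth + 1) - math.lgamma(k + 1) - math.lgamma(depth - k + 1)
    return math.exp(log_comb + log_beta(k + a, depth - k + b) - log_beta(a, b))

def binom_pmf(k, depth, p):
    """P(X = k) for the plain binomial, for comparison."""
    return math.comb(depth, k) * p ** k * (1.0 - p) ** (depth - k)
```

With a small s (large N) `betabinom_pmf` approaches `binom_pmf`, and with a larger s it spreads probability further from the mean, which is the extra variation described above.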

In an embodiment, the informatics method may incorporate consistent per-SNP bias. Due to artifacts of the sequencing process, some SNPs may have consistently lower or higher counts irrespective of the true amount of A content. Suppose that SNP i consistently adds a bias of W_i percent to the number of A counts. In some embodiments, this bias can be estimated from the set of training data derived under the same conditions, and added back in to the parent sequence data estimate as:

P(SM|m, i) = P_{X|m}(am_i), where X|m ~ BetaBinom(p_m(A) + W_i, am_i + bm_i, s)

and with the free floating DNA sequence data probability estimate as:

P(S|m, c, cf, H, i) = P_X(a_i), where X ~ BetaBinom(p(A) + W_i, a_i + b_i, s).

In some embodiments, the method may be written to specifically take into account additional noise, differential sample quality, differential SNP quality, and random sampling bias. An example of this is given here. This method has been shown to be particularly useful in the context of data generated using the massively multiplexed mini-PCR protocol, and was used in Experiments 7 through 13. The method involves several steps that each introduce a different kind of noise and/or bias to the final model:

(1) Suppose the first sample that comprises a mixture of maternal and fetal DNA contains an original amount of DNA of size N₀ molecules, usually in the range 1,000-40,000, where p = the true fraction of reference alleles.

(2) In the amplification using the universal ligation adaptors, assume that N₁ molecules are sampled; usually N₁ ~ N₀/2 molecules, and random sampling bias is introduced due to sampling. The amplified sample may contain a number of molecules N₂, where N₂ >> N₁. Let X₁ represent the amount of reference loci (on a per-SNP basis) out of N₁ sampled molecules, with a variation in p₁ = X₁/N₁ that introduces random sampling bias throughout the rest of the protocol. This sampling bias is included in the model by using a Beta-Binomial (BB) distribution instead of a simple Binomial distribution model. Parameter N of the Beta-Binomial distribution may be estimated later on a per-sample basis from training data after adjusting for leakage and amplification bias, on SNPs with 0 < p < 1. Leakage is the tendency for a SNP to be read incorrectly.

(3) The amplification step will amplify any allelic bias; thus amplification bias is introduced due to possible uneven amplification. Suppose that one allele at a locus is amplified f times and another allele at that locus is amplified g times, where f = g e^b, and b = 0 indicates no bias. The bias parameter, b, is centered at 0, and indicates how much more or less the A allele gets amplified as opposed to the B allele on a particular SNP. The parameter b may differ from SNP to SNP. Bias parameter b may be estimated on a per-SNP basis, for example from training data.

(4) The sequencing step involves sequencing a sample of amplified molecules. In this step there may be leakage, where leakage is the situation where a SNP is read incorrectly. Leakage may result from any number of problems, and may result in a SNP being read not as the correct allele A, but as another allele B found at that locus or as an allele C or D not typically found at that locus. Suppose the sequencing measures the sequence data of a number of DNA molecules from an amplified sample of size N₃, where N₃ < N₂. In some embodiments, N₃ may be in the range of 20,000 to 100,000; 100,000 to 500,000; 500,000 to 4,000,000; 4,000,000 to 20,000,000; or 20,000,000 to 100,000,000. Each molecule sampled has a probability p_g of being read correctly, in which case it will show up correctly as allele A. The sample will be incorrectly read as an allele unrelated to the original molecule with probability 1-p_g, and will look like allele A with probability p_r, allele B with probability p_m, or allele C or allele D with probability p₀, where p_r + p_m + p₀ = 1.

Parameters p_g, p_r, p_m, p₀ are estimated on a per-SNP basis from the training data.

Different protocols may involve similar steps with variations in the molecular biology steps resulting in different amounts of random sampling, different levels of amplification and different leakage bias. The following model may be equally well applied to each of these cases. The model for the amount of DNA sampled, on per SNP basis, is given by:

X₃~BetaBinomial(L(F(p,b),p_r,p_g), N*H(p,b))

where p = the true amount of reference DNA, b = per SNP bias, and as described above, p_g is the probability of a correct read, p_r is the probability of read being read incorrectly but serendipitously looking like the correct allele, in case of a bad read, as described above, and:

F(p,b) = pe^b/(pe^b + (1-p)), H(p,b) = (e^b p + (1-p))²/e^b, L(p,p_r,p_g) = p*p_g + p_r*(1-p_g).
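The three correction functions can be written out directly, as a minimal sketch. `F`, `H` and `L` follow the formulas above; the helper `x3_params`, which returns the two arguments of the Beta-Binomial for X₃, is a hypothetical name added for illustration.

```python
import math

def F(p, b):
    """Reference fraction after amplification bias b (b = 0 means no bias)."""
    e = math.exp(b)
    return p * e / (p * e + (1.0 - p))

def H(p, b):
    """Scaling applied to the Beta-Binomial parameter N under amplification bias b."""
    e = math.exp(b)
    return (e * p + (1.0 - p)) ** 2 / e

def L(p, p_r, p_g):
    """Leakage-adjusted fraction: correct reads plus bad reads that look like the reference."""
    return p * p_g + p_r * (1.0 - p_g)

def x3_params(p, b, p_r, p_g, n):
    """Mean fraction L(F(p,b), p_r, p_g) and effective size N*H(p,b) for X3."""
    return L(F(p, b), p_r, p_g), n * H(p, b)
```

With b = 0 and p_g = 1 the corrections vanish: F returns p, H returns 1, and L returns p, so the model reduces to a Beta-Binomial with mean p and parameter N.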

In some embodiments, the method uses a Beta-Binomial distribution instead of a simple binomial distribution; this takes care of the random sampling bias. Parameter N of the Beta-Binomial distribution is estimated on a per-sample basis as needed. Using bias correction F(p,b), H(p,b), instead of just p, takes care of the amplification bias. Parameter b of the bias is estimated on a per-SNP basis from training data ahead of time.

In some embodiments the method uses leakage correction L(p,p_r,p_g), instead of just p; this takes care of the leakage bias, i.e. varying SNP and sample quality. In some embodiments, parameters p_g, p_r, p₀ are estimated on per SNP basis from the training data ahead of time. In some embodiments, the parameters p_g, p_r, p₀ may be updated with the current sample on the go, to account for varying sample quality.

The model described herein is quite general and can account for both differential sample quality and differential SNP quality. Different samples and SNPs are treated differently, as exemplified by the fact that some embodiments use Beta-Binomial distributions whose mean and variance are a function of the original amount of DNA, as well as sample and SNP quality.

Examples

Experiment #1.

A set of 197 blood samples was collected from pregnant women determined to be at a high risk of chromosomal abnormalities due to at least one of the following: (1) a risk of aneuploidy greater than 1/100 according to first trimester serum screen, (2) an ultrasound abnormality indicative of a chromosomal abnormality, or (3) a maternal age of 39 or greater. A paternal blood sample was also collected for each maternal sample.

Maternal venous blood samples (20 mL in Streck cell-free DNA BCT™ tubes) were obtained before chorionic villus sampling from the 205 singleton pregnancies. The patients gave written informed consent to provide samples for research into early prediction of pregnancy complications according to an IRB approved protocol.

Pregnant couples were enrolled at selected prenatal care centers under Institutional Review Board-approved protocols pursuant to local laws. Women were at least 18 years of age, had a GA of at least 9 weeks, singleton pregnancies, and signed an informed consent. A total of 205 maternal blood samples were drawn, and paternal genetic samples were collected (blood or buccal). The cohort included eighteen T21 (Down syndrome), three T18 (Edwards syndrome), two T13 (Patau Syndrome), one 45,X (Turner syndrome), three samples with triploidy, one sample with an unbalanced translocation on chromosome 10, and 170 samples from women with euploid pregnancies. In total, 28 of the 197 (14.2%) samples were abnormal.

The samples were prepared in the following way: up to 20 mL of maternal blood were centrifuged to isolate the buffy coat and the plasma. The genomic DNA in the maternal sample was prepared from the buffy coat and paternal DNA was prepared from a blood sample or saliva sample. Cell-free DNA in the maternal plasma was isolated using the QIAGEN CIRCULATING NUCLEIC ACID kit and eluted in 50 µL TE buffer according to the manufacturer's instructions. Universal ligation adapters were appended to the end of each molecule of 40 µL of purified plasma DNA and libraries were amplified for 9 cycles using adaptor-specific primers. Libraries were purified with AGENCOURT AMPURE beads and eluted in 50 µL DNA suspension buffer. 6 µL of the DNA was amplified with 15 cycles of STAR 1 (95°C for 10 min for initial polymerase activation, then 15 cycles of 96°C for 30 s; 65°C for 1 min; 58°C for 6 min; 60°C for 8 min; 65°C for 4 min and 72°C for 30 s; and a final extension at 72°C for 2 min) using 7.5 nM primer concentration of 19,488 target-specific tagged reverse primers and one library adaptor-specific forward primer at 500 nM.

The hemi-nested PCR protocol involved a second amplification of a dilution of the STAR 1 product for 15 cycles (STAR 2) (95°C for 10 min for initial polymerase activation, then 15 cycles of 95°C for 30 s; 65°C for 1 min; 60°C for 5 min; 65°C for 5 min and 72°C for 30 s; and a final extension at 72°C for 2 min) using reverse tag concentration of 1000 nM, and a concentration of 20 nM for each of 19,488 target-specific forward primers.

An aliquot of the STAR 2 products was then amplified by standard PCR for 12 cycles with 1 µM of tag-specific forward and barcoded reverse primers to generate barcoded sequencing libraries. An aliquot of each library was mixed with libraries of different barcodes and purified using a spin column.

In this way, 19,488 primers were used in the single-well reactions; the primers were designed to target SNPs found on chromosomes 1, 2, 13, 18, 21, X and Y. The amplicons were then sequenced using an ILLUMINA GAIIX sequencer. For plasma samples, approximately 10 million reads were generated by the sequencer, with 9.4-9.6 million reads mapping to the genome (94-96%), and of those, 99.95% mapped to targeted SNPs with a mean depth of read of 460 and a median depth of read of 350. For comparison, a perfectly even distribution would be: 10M reads / 19,488 targets = 513 reads/target. For primer-dimers, 30,000 reads were from sequenced primer-dimers (0.3% of the reads generated by the sequencer). For genomic samples, 99.4-99.7% of the reads mapped to the genome; of those, 99.99% mapped to targeted SNPs, and 0.1% of the reads generated by the sequencer were primer-dimers.

For plasma samples with 10 million sequencing reads, typically at least 19,350 of the 19,488 targeted SNPs (99.3 %) are amplified and sequenced. For DNA samples with 2M sequencing reads, typically at least 19,000 targeted SNPs (97.5%) are amplified and sequenced. The lower number may be due to sampling noise since the number of reads is lower and the sequencer misses some of the amplified products. If desired, the number of sequencing reads can be increased to increase the number of targeted SNPs that are amplified and sequenced.
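The read-depth and coverage figures quoted for this experiment follow from simple ratios, sketched here for reference. The function and variable names are illustrative; the input counts are the ones stated in the text.

```python
TARGETS = 19_488  # targeted SNPs in the single-well reaction

def even_depth(total_reads, targets=TARGETS):
    """Depth of read per target if reads were spread perfectly evenly."""
    return total_reads / targets

def coverage_fraction(snps_seen, targets=TARGETS):
    """Fraction of targeted SNPs that were amplified and sequenced."""
    return snps_seen / targets

plasma_even_depth = even_depth(10_000_000)       # plasma run, about 10M reads
plasma_coverage = coverage_fraction(19_350)      # plasma samples
genomic_coverage = coverage_fraction(19_000)     # genomic samples with 2M reads
primer_dimer_fraction = 30_000 / 10_000_000      # primer-dimer reads in plasma run
```

Comparing `plasma_even_depth` with the reported mean (460) and median (350) depths gives a rough sense of how uneven the amplification was across targets.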

Relevant maternal and paternal genomic DNA samples were amplified using a semi-nested protocol with 19,488 outer forward primers and tagged reverse primers at 7.5 nM in the STAR 1. Thermocycling conditions and composition of STAR 2, and the barcoding PCR, were the same as for the hemi-nested protocol.

Data analysis

Genome sequence alignment was performed using a proprietary algorithm adapted from the Novoalign (Novocraft, Selangor, Malaysia) commercial software package. A chromosome copy number classification algorithm was implemented in MATLAB (MathWorks, Natick, MA, USA) leveraging a proprietary statistical algorithm termed Parental Support™ (PS). The technique uses parental genotypes, data from the HapMap database, and the observed number of sequence reads associated with each of the relevant alleles at SNP loci. A simplified explanation of the PS method follows and is described in greater detail in the Supporting Information. The PS algorithm uses measured parental genotypes and crossover frequency data to create, in silico, billions of possible monosomic, disomic, and trisomic fetal genotypes at measured loci, each considered as a separate hypothesis. PS then uses a data model that predicts what the sequencing data is expected to look like for a plasma sample containing different fetal cfDNA fractions for each hypothetical fetal genotype. Bayesian statistics are used to determine the relative likelihood of each hypothesis given the data, and likelihoods are summed for each copy number hypothesis family: monosomy, disomy, or trisomy. The hypothesis with the maximum likelihood is selected as the copy number and fetal fraction, and the absolute likelihood of the call is the calculated accuracy, analogous to a test-specific risk score.
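The final step of summing likelihoods within each copy number family and selecting the most likely family can be sketched as follows. This is only an illustration of that one step, not the proprietary PS algorithm: the function names, the flat prior over families, and the example log likelihood values are all hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def call_copy_number(loglik_by_family, log_prior=None):
    """Sum hypothesis likelihoods within each family, then normalize across families.

    loglik_by_family: dict mapping a family name to a list of log likelihoods,
    one per hypothetical fetal genotype in that family.
    Returns (best_family, posterior_by_family).
    """
    families = list(loglik_by_family)
    if log_prior is None:
        log_prior = {f: 0.0 for f in families}  # flat prior (assumption)
    scores = {f: logsumexp(loglik_by_family[f]) + log_prior[f] for f in families}
    norm = logsumexp(list(scores.values()))
    posterior = {f: math.exp(scores[f] - norm) for f in families}
    best = max(posterior, key=posterior.get)
    return best, posterior

# Hypothetical log likelihoods for a handful of hypotheses per family.
example = {
    "monosomy": [-120.0, -118.5],
    "disomy": [-101.2, -100.4, -103.0],
    "trisomy": [-109.9, -111.3],
}
best, posterior = call_copy_number(example)
```

The normalized value for the selected family plays the role of the calculated accuracy mentioned above; working in log space avoids underflow when many hypotheses with very small likelihoods are combined.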

Briefly, different probability distributions are expected for each of the two possible alleles at a set of SNPs on the target chromosome depending on the parental genotypes, the fetal fraction, and the fetal chromosome copy number. By comparing the observed allele distributions to the expected allele distributions for each of the possible scenarios, it is possible to determine the most likely scenario and precisely how likely that scenario is. Of the 197 samples, the algorithm determined the ploidy state of 183 of the samples, and a result was not returned for 14 of the samples due to a fetal fraction that was too low. The fetal fraction of the samples that did not return a result ranged from 0.6% to 5.8%.

Based on over 10,000 maternal blood samples run by Natera, expected fetal fraction distributions were modeled for different maternal weight (MW) and gestational age (GA) buckets. For each of the 197 samples, the fetal fraction percentile was calculated, given the maternal weight and gestational age for that sample. Six of the samples were found to have MW and GA adjusted fetal fractions that were in the lowest half-percentile. Of those six samples, one was from a trisomy 21 pregnancy, three were from triploid pregnancies, and one was from a pregnancy with a fetus with an unbalanced translocation on chromosome 10. Thus 5/6, or 83.3%, of the samples in the lowest half-percentile for fetal fraction were abnormal, as compared to 14% of cases in the overall cohort. Therefore, samples with abnormally low fetal fraction were more than five times more likely to be abnormal. A determination of high risk for fetal abnormality should be followed up with further testing designed to confirm the presence of an abnormality. In one embodiment, the follow up may be an invasive procedure such as a chorionic villus biopsy or amniocentesis.
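The comparison between the low fetal fraction group and the overall cohort is a ratio of two abnormality rates, which can be checked with a short calculation. The counts are those stated for this experiment; the function and variable names are illustrative.

```python
def abnormal_rate(abnormal, total):
    """Fraction of samples in a group that were abnormal."""
    return abnormal / total

low_ff_rate = abnormal_rate(5, 6)        # samples in the lowest half-percentile
cohort_rate = abnormal_rate(28, 197)     # all samples in the cohort
relative_rate = low_ff_rate / cohort_rate  # how many times more likely
```

With only six samples in the low fetal fraction group, the ratio carries wide uncertainty and is best read as an indication of enrichment rather than a precise estimate.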

Experiment #2

A number of samples were flagged as having abnormally low fetal fraction. Specifically, they were chosen as being in the lowest half-percentile for fetal fraction, and also from cases with maternal weight below 150 lbs. and gestational age at 14 weeks or below. Follow up was obtained from physicians on 11 of these samples. Four were found to be abnormal, and seven were found to be normal. Of the four abnormal cases, two were found to have a triploid karyotype, one was found to have an ultrasound consistent with trisomy 18 or triploidy, and one had a detached placenta. Thus, 4/11, or 36.4%, of cases with abnormally low fetal fraction involved fetal abnormality. As a whole, the data contains about 2% of women with fetal abnormality; therefore, among the women with abnormally low fetal fraction, the risk of a fetal abnormality was about 20 times higher, as compared to the entire cohort.

Example 3

Blood was drawn from 66,986 pregnant patients. The blood was subjected to Panorama screening for the detection of chromosomal abnormalities, i.e., trisomy 21, trisomy 18, trisomy 13, and X monosomy. Ploidy calls were made on 62,333 of the samples. 1,113 patients were identified as having a high risk of chromosomal abnormality. 60,898 were identified as having a low risk of a chromosomal abnormality. 269 were identified as either having twins or being triploid. 53 were identified as having a sex chromosome trisomy. 4,653 samples did not generate a result, primarily due to the sample having a fraction of fetal DNA that was too low for the algorithm to make a determination; a second sample was requested from patients when the initial sample did not generate a result. A cohort of 313 samples was identified as having a fetal fraction in the lowest 1% of the population, as adjusted for maternal weight and gestational age, and suitable for collection of follow-up information. Samples lacking accurate information for maternal weight or gestational age were not considered in determining the lowest fetal fraction in 1% of the population. Of the 313 samples, 14 provided a result on the initial blood draw, all of which were low-risk; 82 provided a result upon a second blood draw, five of which were high-risk and 77 of which were low-risk; 109 provided no result upon a second blood draw; and 108 were not submitted for a second blood draw. Follow-up contacts of the patients' physicians were obtained for 217 of the 313 samples. Of the 217 samples that were low fetal fraction, 8 were confirmed to be aneuploid by a follow-up procedure, 7 were found to be likely to be aneuploid by a follow-up procedure, 6 were confirmed to be euploid by a follow-up procedure, and 18 were found to be likely to be euploid by a follow-up procedure; no useful information about ploidy state was found for the remaining 178 samples.
The data were used to calculate the odds ratio for the likelihood that a low fetal fraction sample was aneuploid, by calculating the ratio of low fetal fraction aneuploid samples to low fetal fraction euploid samples and dividing that ratio by the ratio of normal fetal fraction aneuploid samples to normal fetal fraction euploid samples. The fractions of normal fetal fraction samples that were euploid and aneuploid were determined by using the NIPT test result as a proxy; thus, the ratio of normal fetal fraction aneuploid samples to normal fetal fraction euploid samples was found to be 1,113/60,898 = 1.83%. The calculated odds ratio was approximately 73 [(8/6)/(1,113/60,898)] based on samples where there was post-test follow-up to confirm the ploidy status of the low fetal fraction and normal fetal fraction samples. The calculated odds ratio was 7.3 [((8+5)/(6+14+77))/(1,113/60,898)] based on samples where ploidy status was established either by post-test follow-up or by a screening result on the initial or second blood draw. In both cases, the odds ratio showed that samples with a fetal fraction in the 1st percentile were considerably more likely to be aneuploid as compared to samples that had a fetal fraction that was not in the 1st percentile.
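The two odds ratios can be recomputed from the bracketed expressions with a short calculation. The counts are taken from the text above; the function name `odds_ratio` and the variable names are illustrative.

```python
def odds_ratio(case_pos, case_neg, ref_pos, ref_neg):
    """(case_pos / case_neg) divided by (ref_pos / ref_neg)."""
    return (case_pos / case_neg) / (ref_pos / ref_neg)

# Reference group: screening result used as a proxy for ploidy status.
REF_ANEUPLOID, REF_EUPLOID = 1_113, 60_898

# Low fetal fraction samples with ploidy confirmed by a follow-up procedure.
or_confirmed = odds_ratio(8, 6, REF_ANEUPLOID, REF_EUPLOID)

# Adding low fetal fraction samples classified by a screening result:
# 5 high-risk, and 14 + 77 low-risk on the initial or second blood draw.
or_broad = odds_ratio(8 + 5, 6 + 14 + 77, REF_ANEUPLOID, REF_EUPLOID)
```

The large gap between the two values reflects how strongly the estimate depends on which low fetal fraction samples are counted as euploid, since the confirmed group contains only 14 samples.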

Claims

What is claimed is:

1. A method of evaluating a fetus for genetic abnormalities, comprising,

obtaining a blood sample from a pregnant human subject,

measuring the fetal fraction of cell free fetal DNA in the blood sample, determining if the fetal fraction of cell free fetal DNA in the blood sample is within the lowest 1 percentile of a reference population, and

generating a report indicating that an invasive genetic analysis procedure should be performed.

2. A method of evaluating a fetus for genetic abnormalities, comprising,

obtaining a blood sample from a pregnant human subject,

measuring the fetal fraction of the cell free fetal DNA in the blood sample, determining if the ratio of cell free fetal DNA in the blood sample to maternal cell free DNA in the blood sample is in the lowest 1 percentile of a reference population, and

performing an invasive genetic test if the fetal fraction is in the lowest 1 percentile of a reference population.

3. A method of evaluating a fetus for genetic abnormalities, comprising,

obtaining a plurality of blood samples from a test population of pregnant human subjects,

measuring the fetal fraction of the cell free fetal DNA in the blood samples, determining a population subset consisting of members of the test population that have fetal fraction that is in the lowest 1 percentile of the test population, and performing an invasive genetic test on at least one member of the population subset.

4. A method of evaluating a fetus for genetic abnormalities, comprising,

measuring the fetal fraction of the cell free fetal DNA in the blood sample to maternal cell free DNA in the blood samples,

determining a population subset consisting of members of the test population that have fetal fraction that is in the lowest 1 percentile of the test population, and performing a test for a detached placenta on at least one member of the population subset.

5. The method of claim 1 and 2, wherein the reference population is matched for gestational age.

6. The method of claim 1 and 2, wherein the reference population is matched for maternal weight.

7. The method of claim 5, wherein the reference population is matched for maternal weight.

8. The method of claim 3 and 4, wherein the test population is matched for gestational age.

9. The method of claim 3 and 4, wherein the test population is matched for maternal weight.

10. The method of claim 9, wherein the test population is matched for maternal weight.

11. The method of claims 1 and 3, wherein the genetic abnormality is aneuploidy.

12. The method of claim 11, wherein the aneuploidy is a trisomy.

13. The method of claim 1 and 2, wherein the fetal fraction is in the lowest 0.5 percentile.

14. The method of claim 3 and 4, wherein the fetal fraction is in the lowest 0.5 percentile.