Basic Genetics

This vignette introduces how basic genetic parameters, such as allele frequencies, genotype frequencies and Hardy-Weinberg Equilibrium test results, are calculated with mixIndependR.

Package Installation

Data Import

The imported dataset should be genotype data with individuals in rows and markers in columns. Excel, csv and vcf file formats are compatible.
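
As a rough illustration of the layout described above, the following base R sketch builds a small table and round-trips it through a csv file. The marker names, individual names and genotypes are hypothetical, and the "|" allele separator is assumed from the function calls used later in this vignette.

```r
# Hypothetical toy data: 3 individuals (rows) by 2 markers (columns).
# The two alleles of each genotype are separated by "|".
toy_geno <- data.frame(
  rs1  = c("A|G", "A|A", "G|G"),
  STR1 = c("12|14", "12|12", "13|14"),
  row.names = c("ind1", "ind2", "ind3"),
  stringsAsFactors = FALSE
)

# Write to a temporary csv file and read it back with base R only.
toy_path <- tempfile(fileext = ".csv")
write.csv(toy_geno, toy_path)
toy_x <- read.csv(toy_path, row.names = 1, stringsAsFactors = FALSE)

dim(toy_x)  # number of individuals, number of markers
```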

mixexample is a dataset attached to this package. It contains the genotypes of 2504 individuals at 100 unlinked variants, including 6 STRs and 94 SNPs.

x <- mixexample

Basic Genetic Parameters

AlleleFreq calculates the allele frequencies for one dataset.

p <- AlleleFreq(x,sep = "\\|")
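
To make the quantity concrete, here is a minimal base R sketch of an allele frequency calculation at a single locus. It is not the package's implementation; the genotypes and the variable names are hypothetical.

```r
# Hypothetical genotypes of 4 individuals at one locus.
toy_locus <- c("A|G", "A|A", "G|G", "A|A")

# Split each genotype into its two alleles and pool all 2N alleles.
toy_alleles <- unlist(strsplit(toy_locus, "|", fixed = TRUE))

# Relative frequency of each allele among the pooled alleles.
toy_freq <- table(toy_alleles) / length(toy_alleles)
toy_freq
```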

GenotypeFreq calculates the observed or expected genotype frequencies. If expect=FALSE, the observed genotype frequencies are calculated from the original dataset. If expect=TRUE, the expected genotype probabilities under Hardy-Weinberg Equilibrium are derived from the allele frequency table.

G <- GenotypeFreq(x,sep = "\\|",expect = FALSE) 
G0 <- GenotypeFreq(x,sep = "\\|",expect = TRUE) 
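
The difference between the two settings can be sketched in base R for one biallelic locus. This is an illustration under stated assumptions (hypothetical genotypes, two alleles only), not the package's code: observed frequencies are counted directly, while expected probabilities follow the Hardy-Weinberg proportions p^2, 2pq and q^2.

```r
# Hypothetical genotypes of 4 individuals at one biallelic locus.
toy_locus <- c("A|G", "A|A", "G|G", "A|A")
toy_split <- strsplit(toy_locus, "|", fixed = TRUE)

# Observed: share of individuals carrying each genotype.
# Sorting the alleles makes "G|A" and "A|G" count as the same genotype.
toy_sorted <- vapply(toy_split,
                     function(a) paste(sort(a), collapse = "|"),
                     character(1))
toy_observed <- table(toy_sorted) / length(toy_sorted)

# Expected under Hardy-Weinberg Equilibrium, from the allele frequencies.
toy_alleles <- unlist(toy_split)
pA <- mean(toy_alleles == "A")
qG <- mean(toy_alleles == "G")
toy_expected <- c("A|A" = pA^2, "A|G" = 2 * pA * qG, "G|G" = qG^2)
```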

Heterozygous tests the heterozygosity of each individual at each locus and outputs a table in which 0 denotes homozygous and 1 denotes heterozygous.

h <- Heterozygous(x,sep = "\\|") # or just use Heterozygous(x)
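
The 0/1 coding can be sketched in base R as follows. The helper is_het and the toy data are hypothetical and only illustrate the output shape described above; the package may implement this differently.

```r
# Hypothetical toy data: 3 individuals by 2 markers.
toy_geno <- data.frame(
  rs1  = c("A|G", "A|A", "G|G"),
  STR1 = c("12|14", "12|12", "13|14"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 1 if the two alleles differ, 0 if they are identical.
is_het <- function(genotypes) {
  parts <- strsplit(genotypes, "|", fixed = TRUE)
  vapply(parts, function(a) as.integer(a[1] != a[2]), integer(1))
}

# Apply the helper column by column to get an individuals-by-loci table.
toy_h <- as.data.frame(lapply(toy_geno, is_het))
toy_h
```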

RxpHetero calculates the real or expected average heterozygosity at each locus. If HWE=TRUE, this function calculates the expected heterozygosities under Hardy-Weinberg Equilibrium; if HWE=FALSE, it calculates the real (observed) average heterozygosities.

H <- RxpHetero(h,p,HWE=TRUE)
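
As a sketch of the two quantities for a single locus, assuming the usual definitions (observed heterozygosity as the mean of the 0/1 indicators, expected heterozygosity as one minus the sum of squared allele frequencies). The input values are hypothetical, and this is not necessarily how RxpHetero computes them internally.

```r
# Hypothetical 0/1 heterozygosity indicators for 4 individuals at one locus.
toy_het <- c(1, 0, 0, 0)

# Hypothetical allele frequencies at the same locus.
toy_freq <- c(A = 0.625, G = 0.375)

# Real (observed) average heterozygosity: share of heterozygous individuals.
obs_H <- mean(toy_het)

# Expected heterozygosity under Hardy-Weinberg Equilibrium.
exp_H <- 1 - sum(toy_freq^2)
```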

AlleleShare calculates a table of the number of shared alleles for each pair of individuals at each locus. If replacement=TRUE, the pairs are formed with replacement; if replacement=FALSE, the pairs are formed without replacement. When the sample size is large, replacement=FALSE is much faster.

AS<-AlleleShare(x,sep = "\\|",replacement = FALSE) 
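
For a single pair of genotypes, the count of shared alleles can be sketched in base R as below. The helper shared_alleles is hypothetical and assumes that each allele of one individual can be matched to at most one allele of the other; how the package forms the pairs is not shown here.

```r
# Hypothetical helper: number of alleles (0, 1 or 2) shared by two genotypes.
shared_alleles <- function(g1, g2) {
  a <- strsplit(g1, "|", fixed = TRUE)[[1]]
  b <- strsplit(g2, "|", fixed = TRUE)[[1]]
  n <- 0L
  for (allele in a) {
    hit <- match(allele, b)
    if (!is.na(hit)) {
      n <- n + 1L
      b <- b[-hit]  # each allele of the second genotype is matched only once
    }
  }
  n
}

shared_alleles("A|A", "A|G")  # one allele in common
shared_alleles("A|G", "G|A")  # same genotype, allele order ignored
shared_alleles("A|A", "G|G")  # nothing in common
```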

RealProAlleleShare and ExpProAlleleShare calculate, respectively, the average observed proportions and the expected probabilities of sharing 0, 1 and 2 alleles at each locus.

e <-RealProAlleleShare(AS)
e0<-ExpProAlleleShare(p)
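
The two kinds of quantities can be illustrated for one biallelic locus. This is a sketch under explicit assumptions: the shared-allele counts are hypothetical, and the expected probabilities assume two unrelated individuals drawn independently from a population in Hardy-Weinberg Equilibrium. It is not the package's implementation.

```r
# Observed: hypothetical shared-allele counts for 8 pairs at one locus.
toy_counts <- c(2, 1, 1, 0, 2, 1, 1, 1)
obs_prop <- table(factor(toy_counts, levels = 0:2)) / length(toy_counts)

# Expected: genotype probabilities under HWE for alleles A and G.
pA <- 0.5
qG <- 0.5
geno_prob <- c(AA = pA^2, AG = 2 * pA * qG, GG = qG^2)

# Number of alleles shared by each pair of genotypes (rows and columns: AA, AG, GG).
share <- matrix(c(2, 1, 0,
                  1, 2, 1,
                  0, 1, 2), nrow = 3, byrow = TRUE)

# Joint probability of each genotype pair, then summed by shared-allele count.
joint <- outer(geno_prob, geno_prob)
exp_prob <- vapply(0:2, function(k) sum(joint[share == k]), numeric(1))
names(exp_prob) <- 0:2
```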

HWE.Chisq tests Hardy-Weinberg Equilibrium with Pearson’s Chi-square test. B is an integer specifying the number of replicates used in the Monte Carlo test.

HWE_pvalue <- HWE.Chisq(G,G0,rescale.p = TRUE,simulate.p.value = TRUE,B=2000)
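
The arguments rescale.p, simulate.p.value and B have the same names as those of chisq.test in base R. The following sketch runs a Pearson Chi-square test with a Monte Carlo p-value for one biallelic locus using chisq.test directly. The genotype counts are hypothetical, and this is not necessarily how HWE.Chisq works internally.

```r
# Hypothetical observed genotype counts at one biallelic locus.
obs_counts <- c(AA = 30, AG = 50, GG = 20)

# Frequency of allele A estimated from the counts.
n_ind <- sum(obs_counts)
pA <- (2 * obs_counts[["AA"]] + obs_counts[["AG"]]) / (2 * n_ind)

# Expected genotype probabilities under Hardy-Weinberg Equilibrium.
exp_prob <- c(pA^2, 2 * pA * (1 - pA), (1 - pA)^2)

# Pearson Chi-square test with a p-value simulated from B Monte Carlo replicates.
set.seed(1)
res <- chisq.test(obs_counts, p = exp_prob, rescale.p = TRUE,
                  simulate.p.value = TRUE, B = 2000)
res$p.value
```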