LDAK-KVIK output
Below, we describe the output files generated by LDAK-KVIK when it is run with the following commands:
./ldak6.linux --kvik-step1 kvik --bfile data --pheno phenofile --covar covfile --max-threads 4
./ldak6.linux --kvik-step2 kvik --bfile data --pheno phenofile --covar covfile --max-threads 4
./ldak6.linux --kvik-step3 kvik --bfile data --genefile genefile --max-threads 4
Step 1
Step 1 of LDAK-KVIK generates five files: kvik.step1.progress, kvik.step1.root, kvik.step1.loco.details, kvik.step1.loco.prs and kvik.step1.effects. The kvik.step1.progress file contains the screen output of Step 1. The kvik.step1.root file records the input arguments, which are passed on to the later steps. The kvik.step1.loco.details file contains the scaling estimate, power parameter and heritability estimate, and kvik.step1.loco.prs contains the leave-one-chromosome-out predictors constructed using the elastic net. Finally, the kvik.step1.effects file contains the SNP effects of the full elastic net model based on all chromosomes.
Step 2
The second step of LDAK-KVIK performs single-SNP analysis using the PRS from Step 1 as an offset. The fitted regression coefficients of the covariates are saved in kvik.step2.coeff, and the results of the single-SNP analysis are saved in kvik.step2.pvalues, kvik.step2.summaries and kvik.step2.assoc. The kvik.step2.progress file contains the screen output of --kvik-step2.
The files kvik.step2.pvalues and kvik.step2.summaries contain the per-SNP P values and Z-scores, and kvik.step2.assoc contains the complete summary statistics. The header and first rows of an example kvik.step2.assoc file are:
Chromosome Predictor Basepair A1 A2 Wald_Stat Wald_P Effect SD A1_mean MAF SPA_Status
1 SNP1 10000 A C -0.4717 6.3715e-01 -6.9038e-03 1.4636e-02 0.241400 0.120700 NOT_USED
1 SNP2 20000 A C 0.0488 9.6105e-01 4.7483e-04 9.7227e-03 0.843100 0.421550 NOT_USED
1 SNP3 30000 A C 1.0593 2.8945e-01 1.0828e-02 1.0222e-02 0.636000 0.318000 NOT_USED
1 SNP4 40000 A C 0.4545 6.4948e-01 5.0009e-03 1.1003e-02 0.502900 0.251450 NOT_USED
1 SNP5 50000 A C -0.5006 6.1664e-01 -5.5770e-03 1.1140e-02 0.475200 0.237600 NOT_USED
...
The regression coefficients and their standard errors are given in the Effect and SD columns, respectively. Wald_Stat contains the Z-scores, and Wald_P contains the associated P values. SPA_Status indicates whether the saddlepoint approximation has been applied.
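As a rough illustration of how these columns relate, the sketch below copies the example rows into a small file and uses awk to recompute each Z-score as Effect divided by SD, printing it next to Wald_Stat. It then keeps the header plus any SNP whose Wald_P falls below a threshold. The file names example.assoc and example.hits are hypothetical, and the threshold of 5e-8 is a conventional genome-wide choice assumed here, not something LDAK-KVIK prescribes. On real output you would point awk at kvik.step2.assoc instead.

```shell
# Copy the example rows into a small whitespace-delimited file
# (example.assoc is a hypothetical name used only for this sketch).
cat > example.assoc <<'EOF'
Chromosome Predictor Basepair A1 A2 Wald_Stat Wald_P Effect SD A1_mean MAF SPA_Status
1 SNP1 10000 A C -0.4717 6.3715e-01 -6.9038e-03 1.4636e-02 0.241400 0.120700 NOT_USED
1 SNP2 20000 A C 0.0488 9.6105e-01 4.7483e-04 9.7227e-03 0.843100 0.421550 NOT_USED
1 SNP3 30000 A C 1.0593 2.8945e-01 1.0828e-02 1.0222e-02 0.636000 0.318000 NOT_USED
1 SNP4 40000 A C 0.4545 6.4948e-01 5.0009e-03 1.1003e-02 0.502900 0.251450 NOT_USED
1 SNP5 50000 A C -0.5006 6.1664e-01 -5.5770e-03 1.1140e-02 0.475200 0.237600 NOT_USED
EOF

# For each SNP, print the name, the reported Wald_Stat (column 6)
# and the ratio Effect / SD (columns 8 and 9), both to four decimals.
zcheck=$(awk 'NR > 1 { printf "%s %.4f %.4f\n", $2, $6, $8 / $9 }' example.assoc)
echo "$zcheck"

# Keep the header plus SNPs with Wald_P (column 7) below the threshold.
# The value 5e-8 is an assumed, conventional genome-wide threshold.
awk -v thresh=5e-8 'NR == 1 || $7 + 0 < thresh' example.assoc > example.hits
cat example.hits
```

None of the five example SNPs passes this threshold, so example.hits contains only the header line.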
Step 3
The third step of LDAK-KVIK generates results from gene-based association analysis. The summary statistics are saved in kvik.step3.remls.all, for example:
Gene_Name Gene_Chr Gene_Start Gene_End Length Heritability SD Null_Likelihood Alt_Likelihood LRT_Stat LRT_P_Raw LRT_P_Perm
OR4F5 1 69091 70008 1 0.000075 0.000210 -19830.5503 -19830.3884 0.3239 2.8465e-01 1.7152e-01
LOC100996442 1 142447 174392 3 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
SAMD11 1 859993 879961 2 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
NOC2L 1 879583 894679 2 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
KLHL17 1 895967 901099 1 0.000185 0.000366 -19830.5503 -19829.9048 1.2912 1.2792e-01 7.4051e-02
PLEKHN1 1 901872 910488 1 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
The Gene_Name column gives the name of each gene, and the P values are stored in the LRT_P_Perm column. For an overview of the gene-based association method, we refer to the LDAK-GBAT publication.
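One common way to inspect a file in this format is to rank genes by the LRT_P_Perm column. The sketch below does this for the example rows shown above, using standard shell tools. The file name example.remls.all is hypothetical, and the column position (12) is read off the example header rather than taken from a format specification, so it is worth checking against your own output. On real results you would use kvik.step3.remls.all as the input.

```shell
# Copy the example rows into a small whitespace-delimited file
# (example.remls.all is a hypothetical name used only for this sketch).
cat > example.remls.all <<'EOF'
Gene_Name Gene_Chr Gene_Start Gene_End Length Heritability SD Null_Likelihood Alt_Likelihood LRT_Stat LRT_P_Raw LRT_P_Perm
OR4F5 1 69091 70008 1 0.000075 0.000210 -19830.5503 -19830.3884 0.3239 2.8465e-01 1.7152e-01
LOC100996442 1 142447 174392 3 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
SAMD11 1 859993 879961 2 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
NOC2L 1 879583 894679 2 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
KLHL17 1 895967 901099 1 0.000185 0.000366 -19830.5503 -19829.9048 1.2912 1.2792e-01 7.4051e-02
PLEKHN1 1 901872 910488 1 0.000001 NA -19830.5503 -19830.5503 0.0000 7.5000e-01 6.8280e-01
EOF

# Drop the header, sort by LRT_P_Perm (column 12) in ascending order,
# and keep only the gene name and its P value.
# sort -g handles scientific notation; LC_ALL=C fixes the decimal point.
ranked=$(tail -n +2 example.remls.all | LC_ALL=C sort -g -k12,12 | awk '{ print $1, $12 }')
echo "$ranked"

# Name of the gene with the smallest LRT_P_Perm.
top_gene=$(echo "$ranked" | head -n 1 | awk '{ print $1 }')
echo "Top-ranked gene: $top_gene"
```

Several genes in the example share the same P value, so their relative order in the ranking is arbitrary.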