1. ####ERROR MESSAGE: SAM/BAM file xxx.bam.bam is malformed: SAM file doesn’t have any read groups defined in the header. The GATK no longer supports SAM files without read groups
solution:
specify -r “@RG\tID:sample\tLB:sample\tPL:ILLUMINA\tSM:sample” when running bwa alignment
or use picard (AddOrReplaceReadGroups.jar) to add read group such as "ID:1 PL:illumina PU:NA LB:BC SM:140"
2.##### ERROR MESSAGE: Lexicographically sorted human genome sequence detected in reads.
##### ERROR For safety's sake the GATK requires human contigs in karyotypic order: 1, 2, ..., 10, 11, ..., 20, 21, 22, X, Y with M either leading or trailing these contigs.
Solution:
use picard (ReorderSam.jar) to reorder the bam file, however, this does not reorder the chromosome order in the bam header (The picard version I used was picard-tools-1.72), to solve this problem, run picard ReorderSam.jar first, and then run samtools redeader to replace the bam header (karyotypically ordered).
samtools redeader header.sam in.sam
Notice: the content in header.sam is tab separated
cat header.sam
@HD VN:1.0 SO:coordinate
@SQ SN:chr1 LN:249250621
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr19 LN:59128983
@SQ SN:chr2 LN:243199373
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chrM LN:16571
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@RG ID:1 PL:illumina PU:NA LB:BC SM:140
Read Full Post »