Discussion:
[Samtools-help] bcftools merge : multiple rows per position
Davis, Steven
2016-11-21 14:12:27 UTC
Anyone have any ideas? Still looking for help with this.

From: Davis, Steven [mailto:***@fda.hhs.gov]
Sent: Thursday, November 17, 2016 3:24 PM
To: samtools-***@lists.sourceforge.net
Subject: [Samtools-help] bcftools merge multiple rows per position

Hello support,

I am merging VCF files with bcftools merge using this command:

bcftools merge --info-rules NS:sum -o outFilePath dir/*.gz

Previously, using bcftools 1.1, the output VCF file contained one row per position, like this:

#CHROM POS ID REF ALT QUAL FILTER INFO
barref 1094 . A G . PASS NS=24
barref 1221 . G A . PASS NS=24
barref 2015 . G A . PASS NS=24

Using a newer version of bcftools (1.2 or 1.3.1), I see two rows per position, like this:

#CHROM POS ID REF ALT QUAL FILTER INFO
barref 1094 . A . . PASS NS=20
barref 1094 . A G . PASS NS=4
barref 1221 . G . . PASS NS=20
barref 1221 . G A . PASS NS=4
barref 2015 . G . . PASS NS=20
barref 2015 . G A . PASS NS=4

How can I get one row of output per position when using bcftools 1.2 or higher?
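
A quick way to confirm whether a merged file really has more than one row per position is to count CHROM/POS pairs. The sketch below is not from the original post: the function name dup_positions and the file name merged.vcf are placeholders, and it assumes an uncompressed VCF.

```shell
# List every CHROM/POS pair that appears on more than one data row of a VCF,
# followed by the number of rows found for that pair.
# Usage: dup_positions merged.vcf   (merged.vcf is a placeholder name)
dup_positions() {
    awk '!/^#/ && NF >= 2 { n[$1 "\t" $2]++ }
         END { for (k in n) if (n[k] > 1) print k "\t" n[k] }' "$1" | sort
}
```

An empty result means every position occupies a single row.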

Thanks,

Steve Davis
Office of Analytics and Outreach
FDA Center for Food Safety and Applied Nutrition
5100 Paint Branch Pkwy
College Park, MD 20740
Office: 240-402-4834
Thomas W. Blackwell
2016-11-21 14:45:10 UTC
Steve -

Totally guessing. Is there something different between the two
.vcf input files being merged which bcftools would interpret as
an informative ALT allele in the file with 4 samples and a "."
ALT allele in the file with 20 samples? Something different in
the .vcf header blocks? Something different in the formatting?
Or, literal "." alternate alleles?

Is it plink or plink-seq which allows you to reset the alternate
allele for each marker? (Might be easier than hand-editing the
.vcf file if that is what's needed.)

Overall, I suspect that there's something about the input .vcf
files which is causing this, and later versions of bcftools are
stricter about checking it than version 1.1. Again, just guessing.

- tom blackwell -
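
One way to test the guess about literal "." alternate alleles is to count, for each input file, how many rows have "." in the ALT column. This is only a sketch: the function name alt_summary is hypothetical, and it assumes the inputs are gzip- or bgzip-compressed VCF files that gzip -dc can read.

```shell
# For each compressed VCF given, report how many data rows have "." in the
# ALT column (column 5) and how many carry a real ALT allele.
# Usage: alt_summary dir/*.gz
alt_summary() {
    for f in "$@"; do
        gzip -dc "$f" | awk -v name="$f" '
            !/^#/ { if ($5 == ".") ref_only++; else variant++ }
            END { printf "%s\tref_only=%d\tvariant=%d\n", name, ref_only, variant }'
    done
}
```

Files that report only ref_only rows at the affected positions would be consistent with the split between the NS=20 and NS=4 rows shown in the question.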
Davis, Steven
2016-11-23 16:46:07 UTC
Problem solved by adding the "--merge all" command-line option to the bcftools merge command.
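
For reference, the command from the original post with that option added might look like the sketch below. The wrapper name merge_one_row is hypothetical; the bcftools options are the ones quoted in this thread. As far as the bcftools documentation describes it, --merge controls which record types may be combined into a single output record, so the exact effect on a given data set is worth checking against the manual for the installed version.

```shell
# Wrapper around the merge command from the original post, with "--merge all" added.
# bcftools merge expects bgzip-compressed, indexed input files.
# Usage: merge_one_row OUTFILE INPUT.vcf.gz [INPUT.vcf.gz ...]
merge_one_row() {
    merge_out=$1
    shift
    bcftools merge --merge all --info-rules NS:sum -o "$merge_out" "$@"
}
```

Called as merge_one_row outFilePath dir/*.gz, this runs the same merge as before apart from the added option.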