Discussion:
[Samtools-help] bgzip file index and fasta file
sebastien letort
2017-03-06 18:09:33 UTC
Hi,

This is just for my own understanding.

I've got a FASTA file (many short sequences) that I compressed with
bgzip to allow random access.
I wrote a small Python script (with pysam) that parses my file.

I've noticed that a myfile.gz.fai was generated even though I already
had a myfile.gzi index file.

Does that mean the gzi index is not the index to use with a FASTA file?
What is the gzi file meant for?

Regards,
Sébastien
Colin
2017-03-08 15:42:13 UTC
Technically, you can get random access without using bgzip on the FASTA file.

You can just run "samtools faidx plainfasta.fa"; the plain-text FASTA
plus its index (.fai) allows random access.
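
To make the idea concrete, here is a minimal sketch in plain Python of how a .fai-style index permits random access into an uncompressed FASTA file. It is not samtools' implementation; the helper names build_fai and fetch are made up for illustration, and the input is assumed to be well formed, with every line of a sequence (except the last) the same length. Each index entry records the sequence length, the byte offset of its first base, the bases per line, and the bytes per line, which is enough to compute where any base sits on disk.

```python
import io


def build_fai(fasta_bytes):
    """Return {name: (length, offset, linebases, linewidth)} for FASTA bytes.

    Hypothetical helper: mirrors the columns of a .fai file, but assumes
    well-formed input with uniform line lengths inside each sequence.
    """
    index = {}
    name = None
    pos = 0  # byte offset of the start of the current line
    for line in io.BytesIO(fasta_bytes):
        if line.startswith(b">"):
            name = line[1:].split()[0].decode()
            # The sequence data starts right after the header line.
            index[name] = [0, pos + len(line), 0, 0]
        elif name is not None:
            entry = index[name]
            bases = len(line.rstrip(b"\r\n"))
            if entry[2] == 0:
                # The first sequence line defines the line geometry.
                entry[2], entry[3] = bases, len(line)
            entry[0] += bases
        pos += len(line)
    return {key: tuple(value) for key, value in index.items()}


def fetch(handle, index, name, start, end):
    """Return bases [start, end) (0-based) of `name` by seeking, not scanning."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Convert base coordinates to byte positions, skipping line terminators.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    handle.seek(first)
    chunk = handle.read(last - first)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


if __name__ == "__main__":
    fasta = b">seq1 desc\nACGTACGT\nACGT\n>seq2\nTTTTGGGG\nCC\n"
    fai = build_fai(fasta)
    print(fetch(io.BytesIO(fasta), fai, "seq1", 6, 10))
```

The offsets in such an index refer to positions in the uncompressed text, which is why the plain-text case needs nothing else.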

I can't comment on why gzi and fai are both necessary in the
bgzip case, though!

-Colin
_______________________________________________
Samtools-help mailing list
https://lists.sourceforge.net/lists/listinfo/samtools-help
sebastien letort
2017-03-08 15:50:29 UTC
Thanks Colin.

In fact, I used bgzip to compress the data (to save space).
It just seems that I don't need the gzi index, and I wonder in which
cases this file is used.
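
On what the gzi file is for, a hedged sketch may help. As far as the bgzip documentation describes it, the .gzi file maps offsets in the uncompressed data to the compressed BGZF blocks that contain them, while the .fai file stores sequence positions as uncompressed offsets; for a bgzipped FASTA the two would therefore be used together. The code below assumes the .gzi layout is a little-endian uint64 entry count followed by that many (compressed offset, uncompressed offset) uint64 pairs, with the first block at (0, 0) left implicit. The function names parse_gzi and locate are hypothetical and the sample numbers are invented.

```python
import bisect
import struct


def parse_gzi(data):
    """Parse .gzi bytes into a list of (compressed, uncompressed) block starts.

    Assumed layout: uint64 count, then `count` pairs of uint64, little-endian.
    The first block at (0, 0) is assumed not to be stored, so it is added here.
    """
    (count,) = struct.unpack_from("<Q", data, 0)
    flat = struct.unpack_from("<%dQ" % (2 * count), data, 8)
    pairs = [(flat[i], flat[i + 1]) for i in range(0, 2 * count, 2)]
    return [(0, 0)] + pairs


def locate(blocks, uncompressed_offset):
    """Return (compressed block start, offset inside the decompressed block)."""
    starts = [u for _, u in blocks]
    # Find the last block whose uncompressed start is <= the target offset.
    i = bisect.bisect_right(starts, uncompressed_offset) - 1
    block_start, block_uncompressed_start = blocks[i]
    return block_start, uncompressed_offset - block_uncompressed_start


if __name__ == "__main__":
    # Invented example: two stored entries after the implicit first block.
    raw = struct.pack("<Q4Q", 2, 1000, 65280, 2100, 130560)
    blocks = parse_gzi(raw)
    print(locate(blocks, 70000))
```

Under these assumptions, a reader would take the uncompressed offset computed from the .fai entry, look it up as above, seek to the compressed block start, decompress that one block, and skip to the offset inside it.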