<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Short Read Aligners: Maq, Eland, and Others</title>
	<atom:link href="http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/feed" rel="self" type="application/rss+xml" />
	<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html</link>
	<description>Medical genomics in the post-genome era</description>
	<lastBuildDate>Fri, 27 Apr 2012 18:07:26 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: bezo</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-109</link>
		<dc:creator>bezo</dc:creator>
		<pubDate>Wed, 09 Jul 2008 04:21:03 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-109</guid>
		<description>A new short-read aligner, novoalign, has just been released. It’s free to use, you don’t need expensive hardware, and it works pretty well with Illumina reads.

Some features are:

* Gaps up to 7bp, affine gap penalties
* Can handle ambiguous codes in ref sequence.
* Quality-based scoring
* Adapter stripping for miRNA reads
* No heuristics - reports the best alignment
* Options for handling multiple alignments include none, random, and all alignments.
* Alignment Quality scores
* Can use fasta, fastq, solexa fastq, prb input formats
* Paired end with full Needleman-Wunsch on both ends.
* Paired end accepts a structural variation (SV) penalty, and the best alignment may be two independent ends if their score with the SV penalty is better than the best pair that fits the fragment length distribution.
* Supports variable read lengths
* Includes optional soft masking of repeats.
* Iterative read trimming

Give it a whirl. In terms of performance it’s quite fast; some users on seqanswers.com have commented that it runs faster than the SOAP program.

Have a read on the website http://www.novocraft.com and download the executables for 64-bit Linux and Mac OS 10.5.3.</description>
		<content:encoded><![CDATA[<p>A new short-read aligner, novoalign, has just been released. It’s free to use, you don’t need expensive hardware, and it works pretty well with Illumina reads.</p>
<p>Some features are:</p>
<p>* Gaps up to 7bp, affine gap penalties<br />
* Can handle ambiguous codes in ref sequence.<br />
* Quality-based scoring<br />
* Adapter stripping for miRNA reads<br />
* No heuristics &#8211; reports the best alignment<br />
* Options for handling multiple alignments include none, random, and all alignments.<br />
* Alignment Quality scores<br />
* Can use fasta, fastq, solexa fastq, prb input formats<br />
* Paired end with full Needleman-Wunsch on both ends.<br />
* Paired end accepts a structural variation (SV) penalty, and the best alignment may be two independent ends if their score with the SV penalty is better than the best pair that fits the fragment length distribution.<br />
* Supports variable read lengths<br />
* Includes optional soft masking of repeats.<br />
* Iterative read trimming</p>
<p>Give it a whirl. In terms of performance it’s quite fast; some users on seqanswers.com have commented that it runs faster than the SOAP program.</p>
<p>Have a read on the website <a href="http://www.novocraft.com" rel="nofollow">http://www.novocraft.com</a> and download the executables for 64-bit Linux and Mac OS 10.5.3.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: zayedi</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-105</link>
		<dc:creator>zayedi</dc:creator>
		<pubDate>Tue, 17 Jun 2008 10:58:39 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-105</guid>
		<description>I&#039;m still not so keen on using eland in situations when my reads are less than 25bp but greater than 20bp, e.g. with microRNAs sequenced from Solexa. The FP question can only be reliably answered by testing various scenarios where we actually know where our reads will map.
Something I&#039;ve also seen is that you may not align a read where there are ambiguous characters around the mid-section of the short sequence read.
Using paired-end will definitely push up FP, but with higher read errors, indels, etc., the curve for PE alignment may look better.</description>
		<content:encoded><![CDATA[<p>I&#8217;m still not so keen on using eland in situations when my reads are less than 25bp but greater than 20bp, e.g. with microRNAs sequenced from Solexa. The FP question can only be reliably answered by testing various scenarios where we actually know where our reads will map.<br />
Something I&#8217;ve also seen is that you may not align a read where there are ambiguous characters around the mid-section of the short sequence read.<br />
Using paired-end will definitely push up FP, but with higher read errors, indels, etc., the curve for PE alignment may look better.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ac</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-104</link>
		<dc:creator>ac</dc:creator>
		<pubDate>Mon, 19 May 2008 15:36:09 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-104</guid>
		<description>&quot;The main contributor (the other is read quality) to FPs is repeats or near repeats, so a read has alternative alignment locations. Sometimes this will result in one alignment which may turn out to be a FP. Sometimes we’ll get multiple alignments with the same score. With MAQ, its habit of randomly choosing one alignment from a set of equal scoring alignments introduces some FP and some extra (random) TP. Other tools just report the read as having multiple alignments, which some evaluations report as a FN.&quot;

We generally get around this by providing only uniquely aligned reads from Eland as input to MAQ. The obvious drawback to this is that you may be losing some reads. However, I haven&#039;t seen any drop in depth beyond 1X unless it is a highly repetitive region. Fortunately, we did see some FPs go away with this approach, but does anyone see any other drawbacks?</description>
		<content:encoded><![CDATA[<p>&#8220;The main contributor (the other is read quality) to FPs is repeats or near repeats, so a read has alternative alignment locations. Sometimes this will result in one alignment which may turn out to be a FP. Sometimes we’ll get multiple alignments with the same score. With MAQ, its habit of randomly choosing one alignment from a set of equal scoring alignments introduces some FP and some extra (random) TP. Other tools just report the read as having multiple alignments, which some evaluations report as a FN.&#8221;</p>
<p>We generally get around this by providing only uniquely aligned reads from Eland as input to MAQ. The obvious drawback to this is that you may be losing some reads. However, I haven&#8217;t seen any drop in depth beyond 1X unless it is a highly repetitive region. Fortunately, we did see some FPs go away with this approach, but does anyone see any other drawbacks?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Justin</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-103</link>
		<dc:creator>Justin</dc:creator>
		<pubDate>Mon, 19 May 2008 03:26:29 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-103</guid>
		<description>I am under the impression that the random assignment protocol in MAQ for repeats results in a lower mapping quality score for such assigned reads.

Can anyone here point me in a direction to understand how SNP calling is conditioned on mapping quality score for MAQ?

I couldn&#039;t find it on the MAQ page.</description>
		<content:encoded><![CDATA[<p>I am under the impression that the random assignment protocol in MAQ for repeats results in a lower mapping quality score for such assigned reads.</p>
<p>Can anyone here point me in a direction to understand how SNP calling is conditioned on mapping quality score for MAQ?</p>
<p>I couldn&#8217;t find it on the MAQ page.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MB</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-102</link>
		<dc:creator>MB</dc:creator>
		<pubDate>Fri, 16 May 2008 20:47:37 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-102</guid>
		<description>In the current version, eland itself does not do paired-end alignment, but there are other scripts in the GAPipeline which can achieve this by post-processing eland output. Another script calculates mapping qualities. Unfortunately, not many people know how to use them, as they have not been well documented, as far as I know.

In addition, it is easy to benchmark alignment, but it is quite hard to benchmark SNP calling with simulation. Many of the problems that cause wrong SNP calls can hardly be simulated accurately, as people are simply not aware of them.</description>
		<content:encoded><![CDATA[<p>In the current version, eland itself does not do paired-end alignment, but there are other scripts in the GAPipeline which can achieve this by post-processing eland output. Another script calculates mapping qualities. Unfortunately, not many people know how to use them, as they have not been well documented, as far as I know.</p>
<p>In addition, it is easy to benchmark alignment, but it is quite hard to benchmark SNP calling with simulation. Many of the problems that cause wrong SNP calls can hardly be simulated accurately, as people are simply not aware of them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sparks</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-101</link>
		<dc:creator>Sparks</dc:creator>
		<pubDate>Fri, 16 May 2008 00:58:33 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-101</guid>
		<description>The main contributor (the other is read quality) to FPs is repeats or near repeats, so a read has alternative alignment locations. Sometimes this will result in one alignment which may turn out to be a FP. Sometimes we&#039;ll get multiple alignments with the same score. With MAQ, its habit of randomly choosing one alignment from a set of equal scoring alignments introduces some FP and some extra (random) TP. Other tools just report the read as having multiple alignments, which some evaluations report as a FN.</description>
		<content:encoded><![CDATA[<p>The main contributor (the other is read quality) to FPs is repeats or near repeats, so a read has alternative alignment locations. Sometimes this will result in one alignment which may turn out to be a FP. Sometimes we&#8217;ll get multiple alignments with the same score. With MAQ, its habit of randomly choosing one alignment from a set of equal scoring alignments introduces some FP and some extra (random) TP. Other tools just report the read as having multiple alignments, which some evaluations report as a FN.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ML</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-100</link>
		<dc:creator>ML</dc:creator>
		<pubDate>Thu, 15 May 2008 20:22:57 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-100</guid>
		<description>Yes! You can run Eland independently - just get the binaries (squashGenome and eland_??, where ?? is the length of the reads) and the prompt is reasonably well documented. Cheers. It is definitely the fastest gun around.</description>
		<content:encoded><![CDATA[<p>Yes! You can run Eland independently &#8211; just get the binaries (squashGenome and eland_??, where ?? is the length of the reads) and the prompt is reasonably well documented. Cheers. It is definitely the fastest gun around.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ac</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-99</link>
		<dc:creator>ac</dc:creator>
		<pubDate>Thu, 15 May 2008 16:45:48 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-99</guid>
		<description>From what I can see, there are three high-level steps required for SNP calling: mapping, assembling, and SNP calling. It would be interesting to know which tools are best for each component. MAQ currently does them all, but so do ssaha_pileup and Mosaik. After using these two tools, I agree with Heng that MAQ is definitely much more user friendly. Based on Heng&#039;s poster, the FP SNP rate is practically zero for both PE and SE reads, but we have experienced a much higher FP rate using SE reads. Does anyone have any data on MAQ&#039;s true FP rate and what can contribute to FPs? Some obvious causes would be contamination, amplification bias, and sequencing error.</description>
		<content:encoded><![CDATA[<p>From what I can see, there are three high-level steps required for SNP calling: mapping, assembling, and SNP calling. It would be interesting to know which tools are best for each component. MAQ currently does them all, but so do ssaha_pileup and Mosaik. After using these two tools, I agree with Heng that MAQ is definitely much more user friendly. Based on Heng&#8217;s poster, the FP SNP rate is practically zero for both PE and SE reads, but we have experienced a much higher FP rate using SE reads. Does anyone have any data on MAQ&#8217;s true FP rate and what can contribute to FPs? Some obvious causes would be contamination, amplification bias, and sequencing error.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick Hermersmann</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-98</link>
		<dc:creator>Nick Hermersmann</dc:creator>
		<pubDate>Thu, 15 May 2008 16:31:50 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-98</guid>
		<description>Have you tried DNAstar?</description>
		<content:encoded><![CDATA[<p>Have you tried DNAstar?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: zayedi</title>
		<link>http://massgenomics.org/2008/05/short-read-aligners-maq-eland-and-others.html/comment-page-1#comment-97</link>
		<dc:creator>zayedi</dc:creator>
		<pubDate>Thu, 15 May 2008 15:45:54 +0000</pubDate>
		<guid isPermaLink="false">http://massgenomics.wordpress.com/?p=15#comment-97</guid>
		<description>I like the idea of an independent contest. I&#039;m actually working on these sorts of problems at the moment, and hopefully it will shape up into a nice little publication. I&#039;d be happy to share some of my insights if you&#039;re interested.</description>
		<content:encoded><![CDATA[<p>I like the idea of an independent contest. I&#8217;m actually working on these sorts of problems at the moment, and hopefully it will shape up into a nice little publication. I&#8217;d be happy to share some of my insights if you&#8217;re interested.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

