Commit

build based on 88cb2b8
Documenter.jl committed Aug 14, 2023
1 parent e6509dc commit 24820c4
Showing 5 changed files with 20 additions and 5 deletions.
2 changes: 1 addition & 1 deletion dev/index.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions dev/kmer_count/index.html
@@ -1,5 +1,5 @@
<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>The KmerCount type · VectorizedKmers.jl</title><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.045/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.24/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit"><a href="../">VectorizedKmers.jl</a></span></div><form class="docs-search" action="../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li 
class="is-active"><a class="tocitem" href>The KmerCount type</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>The KmerCount type</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>The KmerCount type</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/anton083/VectorizedKmers.jl/blob/main/docs/src/kmer_count.md" title="Edit on GitHub"><span class="docs-icon fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a></div></header><article class="content" id="documenter-page"><h1 id="The-KmerCount-type"><a class="docs-heading-anchor" href="#The-KmerCount-type">The <code>KmerCount</code> type</a><a id="The-KmerCount-type-1"></a><a class="docs-heading-anchor-permalink" href="#The-KmerCount-type" title="Permalink"></a></h1><p>The <code>KmerCount</code> type has four type parameters, but you only really need to care about the first two: <code>A</code>, the alphabet size, and <code>K</code>, the K-mer length. So, to count the 6-mers of a DNA sequence, you would use <code>KmerCount{4, 6}</code>. For each of these K-mer counts, memory for a vector of size <code>A^K</code> is allocated, unless a vector type like <code>SparseVector</code> is used. 
This brings us to the two other type parameters: <code>T</code>, which is the vector element type, and <code>V</code>, which is the type of the actual vector.</p><p>Let&#39;s see it in action:</p><pre><code class="language-julia-repl hljs">julia&gt; kc = KmerCount{4, 2}(); # creates a Vector{Int} of zeros with length 4^2
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>The KmerCount type · VectorizedKmers.jl</title><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.045/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.24/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit"><a href="../">VectorizedKmers.jl</a></span></div><form class="docs-search" action="../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a 
class="tocitem" href="../kmer_int_repr/">Integer representation of k-mers</a></li><li class="is-active"><a class="tocitem" href>The KmerCount type</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>The KmerCount type</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>The KmerCount type</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/anton083/VectorizedKmers.jl/blob/main/docs/src/kmer_count.md" title="Edit on GitHub"><span class="docs-icon fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a></div></header><article class="content" id="documenter-page"><h1 id="The-KmerCount-type"><a class="docs-heading-anchor" href="#The-KmerCount-type">The <code>KmerCount</code> type</a><a id="The-KmerCount-type-1"></a><a class="docs-heading-anchor-permalink" href="#The-KmerCount-type" title="Permalink"></a></h1><p>The <code>KmerCount</code> type has four type parameters, but you only really need to care about the first two: <code>A</code>, the alphabet size, and <code>K</code>, the K-mer length. So, to count the 6-mers of a DNA sequence, you would use <code>KmerCount{4, 6}</code>. For each of these K-mer counts, memory for a vector of size <code>A^K</code> is allocated, unless a vector type like <code>SparseVector</code> is used. 
This brings us to the two other type parameters: <code>T</code>, which is the vector element type, and <code>V</code>, which is the type of the actual vector.</p><p>Let&#39;s see it in action:</p><pre><code class="language-julia-repl hljs">julia&gt; kc = KmerCount{4, 2}(); # creates a Vector{Int} of zeros with length 4^2

julia&gt; using BioSequences # a weak dependency that lets us count kmers of LongDNA{4} sequences

@@ -16,4 +16,4 @@
julia&gt; count_kmers!(kc, dna&quot;ACGT&quot;, reset=false); # avoid reset with reset=false

julia&gt; @show kc; # look! we counted 2-mers of ACGT twice
kc = [0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0]</code></pre></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« Home</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 0.27.25 on <span class="colophon-date" title="Monday 14 August 2023 09:55">Monday 14 August 2023</span>. Using Julia version 1.9.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
kc = [0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0]</code></pre></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../kmer_int_repr/">« Integer representation of k-mers</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 0.27.25 on <span class="colophon-date" title="Monday 14 August 2023 11:14">Monday 14 August 2023</span>. Using Julia version 1.9.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
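The counts shown in the REPL session above can be reproduced without the package. The following is a minimal sketch, assuming the A=0, C=1, G=2, T=3 encoding; `naive_count_kmers!` is a hypothetical helper written for illustration, not part of the VectorizedKmers.jl API, and it makes no attempt to be fast.

```julia
# Hypothetical helper (not the VectorizedKmers.jl API): count the K-mers of an
# unambiguous DNA string into a dense vector of length 4^K.
function naive_count_kmers!(counts::Vector{Int}, seq::AbstractString, K::Int)
    code = Dict('A' => 0, 'C' => 1, 'G' => 2, 'T' => 3)
    bases = collect(seq)
    for start in 1:(length(bases) - K + 1)
        kmer_int = 0
        for nuc in bases[start:start + K - 1]
            kmer_int = 4 * kmer_int + code[nuc]  # read the K-mer as a base-4 number
        end
        counts[kmer_int + 1] += 1  # +1 because Julia vectors are 1-based
    end
    return counts
end

counts = zeros(Int, 4^2)
naive_count_kmers!(counts, "ACGT", 2)
naive_count_kmers!(counts, "ACGT", 2)  # count again without resetting
@show counts
```

The 2-mers AC, CG and GT have integer values 1, 6 and 11, so the non-zero entries land at indices 2, 7 and 12, each with a count of 2.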
15 changes: 15 additions & 0 deletions dev/kmer_int_repr/index.html
@@ -0,0 +1,15 @@
<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Integer representation of k-mers · VectorizedKmers.jl</title><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.045/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.24/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit"><a href="../">VectorizedKmers.jl</a></span></div><form class="docs-search" action="../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul class="docs-menu"><li><a class="tocitem" 
href="../">Home</a></li><li class="is-active"><a class="tocitem" href>Integer representation of k-mers</a><ul class="internal"><li><a class="tocitem" href="#DNA-sequences"><span>DNA sequences</span></a></li><li><a class="tocitem" href="#Amino-acid-sequences"><span>Amino acid sequences</span></a></li></ul></li><li><a class="tocitem" href="../kmer_count/">The KmerCount type</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Integer representation of k-mers</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Integer representation of k-mers</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/anton083/VectorizedKmers.jl/blob/main/docs/src/kmer_int_repr.md" title="Edit on GitHub"><span class="docs-icon fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a></div></header><article class="content" id="documenter-page"><h1 id="Integer-representation-of-k-mers"><a class="docs-heading-anchor" href="#Integer-representation-of-k-mers">Integer representation of k-mers</a><a id="Integer-representation-of-k-mers-1"></a><a class="docs-heading-anchor-permalink" href="#Integer-representation-of-k-mers" title="Permalink"></a></h1><p>This package relies on representing K-mers as integers for indexing, and understanding how it works is recommended (unless you&#39;re only using higher-level API 
stuff).</p><h2 id="DNA-sequences"><a class="docs-heading-anchor" href="#DNA-sequences">DNA sequences</a><a id="DNA-sequences-1"></a><a class="docs-heading-anchor-permalink" href="#DNA-sequences" title="Permalink"></a></h2><p>For DNA, each non-ambiguous nucleotide is assigned a number between 0 and 3:</p><table><tr><th style="text-align: right">Nucleotide</th><th style="text-align: right">Base-4</th><th style="text-align: right">Base-2</th></tr><tr><td style="text-align: right">A</td><td style="text-align: right">0</td><td style="text-align: right">00</td></tr><tr><td style="text-align: right">C</td><td style="text-align: right">1</td><td style="text-align: right">01</td></tr><tr><td style="text-align: right">G</td><td style="text-align: right">2</td><td style="text-align: right">10</td></tr><tr><td style="text-align: right">T</td><td style="text-align: right">3</td><td style="text-align: right">11</td></tr></table><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Any ordering works, but this is the one used by <a href="https://github.com/BioJulia/BioSequences.jl">BioSequences.jl</a>, and it also has some nice properties, like being in alphabetical order, and that XOR-ing a base with 3 gives you its complement.</p></div></div><p>We can technically convert <em>any</em> DNA sequence to an integer, but 64-bit integers limit us to 32-mers.</p><p>Consider the DNA sequence <code>GATTACA</code>. If we convert it to an integer using the table above, we get <span>$2033010_4 = 10001111000100_2 = 9156_{10}$</span>, so the integer value of <code>GATTACA</code> is 9156. Since Julia uses 1-based indexing, we would add 1 to this value to get the index of <code>GATTACA</code> in the vector.</p><p>If we want to write a function for this, we may do something like the following:</p><pre><code class="language-julia hljs">const DNA_ENCODING_VECTOR = zeros(Int, 127)

for (i, c) in enumerate(&quot;ACGT&quot;)
    DNA_ENCODING_VECTOR[c % Int8] = i - 1
end

function kmer_to_int(kmer::String)
    kmer_int = 0
    for nuc in kmer
        kmer_int = (kmer_int &lt;&lt; 2) | DNA_ENCODING_VECTOR[nuc % Int8]
    end
    kmer_int
end</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Strings are bad! Don&#39;t use strings! Please! They&#39;re bad! Very bad! Use something like LongDNA{4}, please! Chars have variable length in Julia, so indexing and taking lengths and stuff is kinda slow! Don&#39;t use strings! Again: very bad! Use LongDNA{4}!</p></div></div><p>This function is not very efficient, since we&#39;re using <code>String</code>, but it works. Let&#39;s test it:</p><pre><code class="language-julia-repl hljs">julia&gt; kmer_to_int(&quot;GATTACA&quot;)
9156</code></pre><h2 id="Amino-acid-sequences"><a class="docs-heading-anchor" href="#Amino-acid-sequences">Amino acid sequences</a><a id="Amino-acid-sequences-1"></a><a class="docs-heading-anchor-permalink" href="#Amino-acid-sequences" title="Permalink"></a></h2><p>Amino acid sequences are a little more difficult to deal with, since there are many more amino acids than nucleotides and the count vectors grow even faster with K. We can still represent amino acid K-mers as integers, but we can&#39;t use bit manipulation anymore.</p><p>BioSequences.jl has 28 amino acids in its AminoAcidAlphabet, so we can represent each amino acid as an integer between 0 and 27.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« Home</a><a class="docs-footer-nextpage" href="../kmer_count/">The KmerCount type »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 0.27.25 on <span class="colophon-date" title="Monday 14 August 2023 11:14">Monday 14 August 2023</span>. Using Julia version 1.9.2.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
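The amino acid section of the page above stops before showing an encoding function. Since 28 is not a power of two, the shift-and-or step used for DNA can be replaced by a multiply-and-add in base 28. The following is a rough sketch under stated assumptions: the symbol order in `AA_SYMBOLS` is chosen here for illustration only (any fixed ordering of the 28 symbols works), and `aa_kmer_to_int` is a hypothetical name, not a function from VectorizedKmers.jl or BioSequences.jl.

```julia
# Assumed symbol order, for illustration only; any fixed ordering of the
# 28 symbols gives a valid encoding.
const AA_SYMBOLS = "ARNDCQEGHILKMFPSTWYVOUBJZX*-"
const AA_CODE = Dict(c => i - 1 for (i, c) in enumerate(AA_SYMBOLS))

# Hypothetical helper: interpret an amino acid K-mer as a base-28 number.
function aa_kmer_to_int(kmer::AbstractString)
    kmer_int = 0
    for aa in kmer
        kmer_int = 28 * kmer_int + AA_CODE[aa]  # multiply-add replaces shift-or
    end
    return kmer_int
end

@show aa_kmer_to_int("AR")     # 0 * 28 + 1
@show length(AA_SYMBOLS)^3     # vector length needed for amino acid 3-mers
```

As with DNA, adding 1 to the result would give the 1-based index into the count vector. A dense vector for 3-mers already needs 28^3 = 21952 entries, which illustrates how quickly the size grows with K.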