Skip to content

Commit

Permalink
rm Frame type, support .cif files, and more
Browse files Browse the repository at this point in the history
  • Loading branch information
AntonOresten committed Jul 19, 2024
1 parent ab78fea commit a1d81bf
Show file tree
Hide file tree
Showing 17 changed files with 3,455 additions and 247 deletions.
4 changes: 3 additions & 1 deletion Project.toml
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
name = "Backboner"
uuid = "9ac9c2a2-1cfe-46d3-b3fd-6fa470ea56a7"
authors = ["Anton Oresten <anton.oresten42@gmail.com>"]
version = "0.10.2"
version = "0.11.0"

[deps]
BioStructures = "de9282ab-8554-53be-b2d6-f6c222edabfc"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
Rotations = "6038ab10-8711-5258-84ad-4b1120ba62dc"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
Expand All @@ -18,6 +19,7 @@ ZygoteIdealizationExt = ["Zygote"]

[compat]
BioStructures = "4"
NNlib = "0.9"
LinearAlgebra = "1"
PrecompileTools = "1"
Rotations = "1"
Expand Down
16 changes: 9 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,14 @@
[![Build Status](https://github.com/MurrellGroup/Backboner.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/MurrellGroup/Backboner.jl/actions/workflows/CI.yml?query=branch%3Amain)
[![Coverage](https://codecov.io/gh/MurrellGroup/Backboner.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/MurrellGroup/Backboner.jl)

Backboner is a Julia package that offers a set of types and functions for working with molecular *backbones*: defined here as continuous chains of bonded atoms.[^1] The package provides a few different types for representing backbones:
This package offers types and functions for working with molecular *backbones*, defined here as continuous chains of bonded atoms.[^1]

Backbones can be represented with different types:
- `Backbone`: a type containing a 3xN matrix of coordinates
- `ChainedBonds`: a type that holds vectors of bond lengths, bond angles, and dihedral angles
- `Frames`: a collection of rotations and translations (e.g. for representing orientations and locations of protein residues)

The `Protein` submodule contains functions and types for working specifically with proteins. A protein can be loaded from a PDB file using the `Backboner.Protein.readpdb` function, which returns a `Vector{Backboner.Protein.Chain}`. Conversely, a `Vector{Backboner.Protein.Chain}` instance can be written to a PDB file using the `writepdb` function.
The `Protein` submodule contains utilities for working specifically with proteins. A protein can be loaded from a PDB file using the `Backboner.Protein.readpdb` function, which returns a `Vector{Backboner.Protein.Chain}`. Conversely, a `Vector{Backboner.Protein.Chain}` instance can be written to a PDB file using the `writepdb` function.

## Installation

Expand All @@ -32,29 +34,29 @@ julia> chains = readpdb("test/data/1ZAK.pdb")
Chain B with 220 residues

julia> backbone = chains[1].backbone
660-element Backbone{Float32, Matrix{Float32}}:
660-element Backbone{Float64, Matrix{Float64}}:
[22.346, 17.547, 23.294]
[22.901, 18.031, 21.993]
[23.227, 16.793, 21.163]
[24.115, 16.923, 20.175]
[24.478, 15.779, 19.336]
[21.48, 14.668, 4.974]
[21.480, 14.668, 4.974]
[22.041, 14.866, 3.569]
[21.808, 13.861, 2.734]
[22.263, 13.862, 1.355]
[21.085, 14.233, 0.446]

julia> ChainedBonds(backbone)
ChainedBonds{Float32, Vector{Float32}} with 659 bonds, 658 angles, and 657 dihedrals
ChainedBonds{Float64, Vector{Float64}} with 659 bonds, 658 angles, and 657 dihedrals

julia> is_knotted(backbone)
false

julia> import Zygote # unlock the `idealize` method for backbones

julia> idealize(backbone, Float32[1.46, 1.52, 1.33], Float32[1.94, 2.04, 2.13])
660-element Backbone{Float32, Matrix{Float32}}:
julia> idealize(backbone, Float64[1.46, 1.52, 1.33], Float64[1.94, 2.04, 2.13])
660-element Backbone{Float64, Matrix{Float64}}:
[22.348574, 17.582397, 23.289886]
[22.90583, 17.977451, 21.999538]
[23.216103, 16.762234, 21.140835]
Expand Down
8 changes: 4 additions & 4 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ julia> chains = readpdb("test/data/1ZAK.pdb")
Chain B with 220 residues

julia> backbone = chains[1].backbone
660-element Backbone{Float32, Matrix{Float32}}:
660-element Backbone{Float64, Matrix{Float64}}:
[22.346, 17.547, 23.294]
[22.901, 18.031, 21.993]
[23.227, 16.793, 21.163]
Expand All @@ -55,15 +55,15 @@ julia> backbone = chains[1].backbone
[21.085, 14.233, 0.446]

julia> ChainedBonds(backbone)
ChainedBonds{Float32, Vector{Float32}} with 659 bonds, 658 angles, and 657 dihedrals
ChainedBonds{Float64, Vector{Float64}} with 659 bonds, 658 angles, and 657 dihedrals

julia> is_knotted(backbone)
false

julia> import Zygote # unlock the `idealize` method for backbones

julia> idealize(backbone, Float32[1.46, 1.52, 1.33], Float32[1.94, 2.04, 2.13])
660-element Backbone{Float32, Matrix{Float32}}:
julia> idealize(backbone, Float64[1.46, 1.52, 1.33], Float64[1.94, 2.04, 2.13])
660-element Backbone{Float64, Matrix{Float64}}:
[22.348574, 17.582397, 23.289886]
[22.90583, 17.977451, 21.999538]
[23.216103, 16.762234, 21.140835]
Expand Down
7 changes: 3 additions & 4 deletions src/backbone.jl
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,14 @@ The `Backbone` type is designed to efficiently store and manipulate the three-di
struct Backbone{T<:Real,M<:AbstractMatrix{T}} <: AbstractVector{AbstractVector{T}}
coords::M

function Backbone{T,M}(coords::M) where {T<:Real,M<:AbstractMatrix{T}}
function Backbone{T,M}(coords::M) where {T,M}
size(coords, 1) == 3 || throw(ArgumentError("Expected the first dimension of coords to have a size of 3"))
return new(coords)
end
end

Backbone{T}(coords::M) where {T <: Real, M <: AbstractMatrix{T}} = Backbone{T, M}(coords)
Backbone{T}(coords::AbstractArray{T}) where T <: Real = Backbone{T}(reshape(coords, size(coords, 1), :))
Backbone(coords::AbstractArray{T}) where T <: Real = Backbone{T}(coords)
Backbone(coords::M) where {T<:Real,M<:AbstractMatrix{T}} = Backbone{T,M}(coords)
Backbone(coords::A) where {T<:Real,A<:AbstractArray{T}} = Backbone(reshape(coords, size(coords, 1), :))

coords(backbone::Backbone) = backbone.coords

Expand Down
64 changes: 33 additions & 31 deletions src/bonds.jl
Original file line number Diff line number Diff line change
@@ -1,55 +1,57 @@
export get_atom_displacements, get_atom_distances
export get_bond_vectors, get_bond_lengths, get_bond_angles, get_dihedrals
export ChainedBonds
export append_bonds!, append_bonds
export prepend_bonds!, prepend_bonds

#= TODO:
lengths -> bond_lengths
angles -> bond_angles
dihedrals -> torsional_angles
=#

using LinearAlgebra
import Rotations: AngleAxis

column_sums(columns::AbstractMatrix{<:Real}) = vec(sum(columns, dims=1))
column_norms(columns::AbstractMatrix{<:Real}) = sqrt.(column_sums(abs2.(columns)))
column_dots(columns1::AbstractMatrix{T}, columns2::AbstractMatrix{T}) where T = column_sums(columns1 .* columns2)
normalize_columns(columns::AbstractMatrix{<:Real}) = columns ./ column_norms(columns)'
norms(A::AbstractArray{<:Real}; dims=1) = sqrt.(sum(abs2, A; dims))
dots(A1::AbstractArray{T}, A2::AbstractMatrix{T}; dims=1) where T <: Real = sum(A1 .* A2; dims)
normalize_slices(A::AbstractArray{<:Real}; dims=1) = A ./ norms(A; dims)

function get_atom_displacements(backbone::Backbone, start::Integer, step::Integer, stride::Integer)
return @view(backbone.coords[:, start+step:stride:end]) .- @view(backbone.coords[:, start:stride:end-step])
end
get_atom_displacements(backbone::Backbone, start::Integer, step::Integer, stride::Integer) =
backbone.coords[:, start+step:stride:end] .- backbone.coords[:, start:stride:end-step]

function get_atom_distances(backbone::Backbone, start::Integer, step::Integer, stride::Integer)
return column_norms(get_atom_displacements(backbone, start, step, stride))
end
get_atom_distances(backbone::Backbone, start::Integer, step::Integer, stride::Integer) =
norms(get_atom_displacements(backbone, start, step, stride))

_get_bond_lengths(bond_vectors::AbstractMatrix{<:Real}) = column_norms(bond_vectors)
_get_bond_lengths(bond_vectors::AbstractMatrix{<:Real}) = norms(bond_vectors)

function _get_bond_angles(bond_vectors::AbstractMatrix{T}) where T
us = @view(bond_vectors[:, begin:end-1])
vs = @view(bond_vectors[:, begin+1:end])
return π .- acos.(clamp.(column_dots(us, vs) ./ (column_norms(us) .* column_norms(vs)), -one(T), one(T)))
us = bond_vectors[:, begin:end-1]
vs = bond_vectors[:, begin+1:end]
return π .- acos.(clamp.(dots(us, vs) ./ (norms(us) .* norms(vs)), -one(T), one(T)))
end

function batched_cross_product(A::AbstractMatrix{T}, B::AbstractMatrix{T}) where T <: Real
function batched_cross(A::AbstractMatrix{T}, B::AbstractMatrix{T}) where T <: Real
C1 = selectdim(A, 1, 2) .* selectdim(B, 1, 3) .- selectdim(A, 1, 3) .* selectdim(B, 1, 2)
C2 = selectdim(A, 1, 3) .* selectdim(B, 1, 1) .- selectdim(A, 1, 1) .* selectdim(B, 1, 3)
C3 = selectdim(A, 1, 1) .* selectdim(B, 1, 2) .- selectdim(A, 1, 2) .* selectdim(B, 1, 1)
return [C1 C2 C3]'
end

function _get_dihedrals(bond_vectors::AbstractMatrix{<:Real})
crosses = batched_cross_product(@view(bond_vectors[:, begin:end-1]), @view(bond_vectors[:, begin+1:end]))
normalized_crosses = normalize_columns(crosses)
cross_crosses = batched_cross_product(@view(normalized_crosses[:, begin:end-1]), @view(normalized_crosses[:, begin+1:end]))
normalized_bond_vectors = normalize_columns(bond_vectors)
sin_values = column_dots(cross_crosses, @view(normalized_bond_vectors[:, begin+1:end-1]))
cos_values = column_dots(@view(normalized_crosses[:, begin:end-1]), @view(normalized_crosses[:, begin+1:end]))
crosses = batched_cross(bond_vectors[:, begin:end-1], bond_vectors[:, begin+1:end])
normalized_crosses = normalize_slices(crosses)
cross_crosses = batched_cross(normalized_crosses[:, begin:end-1], normalized_crosses[:, begin+1:end])
normalized_bond_vectors = normalize_slices(bond_vectors)
sin_values = dots(cross_crosses, normalized_bond_vectors[:, begin+1:end-1])
cos_values = dots(normalized_crosses[:, begin:end-1], normalized_crosses[:, begin+1:end])
dihedrals = atan.(sin_values, cos_values)
return dihedrals
end

get_bond_vectors(backbone::Backbone) = get_atom_displacements(backbone, 1, 1, 1)
get_bond_lengths(backbone::Backbone) = _get_bond_lengths(get_bond_vectors(backbone))
get_bond_angles(backbone::Backbone) = _get_bond_angles(get_bond_vectors(backbone))
get_dihedrals(backbone::Backbone) = _get_dihedrals(get_bond_vectors(backbone))
get_bond_lengths(backbone::Backbone) = _get_bond_lengths(get_bond_vectors(backbone)) |> vec
get_bond_angles(backbone::Backbone) = _get_bond_angles(get_bond_vectors(backbone)) |> vec
get_dihedrals(backbone::Backbone) = _get_dihedrals(get_bond_vectors(backbone)) |> vec

"""
ChainedBonds{T <: Real, V <: AbstractVector{T}}
Expand All @@ -60,21 +62,21 @@ A lazy way to store a backbone as a series of bond lengths, angles, and dihedral
```jldoctest
julia> backbone = Protein.readpdb("test/data/1ZAK.pdb")["A"].backbone
660-element Backbone{Float32, Matrix{Float32}}:
660-element Backbone{Float64, Matrix{Float64}}:
[22.346, 17.547, 23.294]
[22.901, 18.031, 21.993]
[23.227, 16.793, 21.163]
[24.115, 16.923, 20.175]
[24.478, 15.779, 19.336]
[21.48, 14.668, 4.974]
[21.480, 14.668, 4.974]
[22.041, 14.866, 3.569]
[21.808, 13.861, 2.734]
[22.263, 13.862, 1.355]
[21.085, 14.233, 0.446]
julia> bonds = ChainedBonds(backbone)
ChainedBonds{Float32, Vector{Float32}} with 659 bonds, 658 angles, and 657 dihedrals
ChainedBonds{Float64, Vector{Float64}} with 659 bonds, 658 angles, and 657 dihedrals
```
"""
struct ChainedBonds{T <: Real, V <: AbstractVector{T}}
Expand Down Expand Up @@ -102,9 +104,9 @@ Base.reverse(bonds::ChainedBonds) = reverse!(deepcopy(bonds))

function ChainedBonds(backbone::Backbone)
bond_vectors = get_bond_vectors(backbone)
lengths = _get_bond_lengths(bond_vectors)
angles = _get_bond_angles(bond_vectors)
dihedrals = _get_dihedrals(bond_vectors)
lengths = _get_bond_lengths(bond_vectors) |> vec
angles = _get_bond_angles(bond_vectors) |> vec
dihedrals = _get_dihedrals(bond_vectors) |> vec
return ChainedBonds(lengths, angles, dihedrals)
end

Expand Down
91 changes: 23 additions & 68 deletions src/frames.jl
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
export Frame, Frames
export Frames

using LinearAlgebra
using Rotations: QuatRotation, params
using NNlib

centroid(P::AbstractMatrix{<:Real}) = vec(sum(P, dims=2)) ./ size(P, 2)
centroid(A::AbstractArray{<:Real}; dims=2) = sum(A; dims) ./ size(A, 2)

function kabsch_algorithm(P::AbstractMatrix{T}, Q::AbstractMatrix{T}) where T <: Real
size(P) == size(Q) || throw(ArgumentError("P and Q must have the same size"))
Expand All @@ -20,84 +20,39 @@ function kabsch_algorithm(P::AbstractMatrix{T}, Q::AbstractMatrix{T}) where T <:
return R, P_centroid, Q_centroid
end

"""
Frame{T <: Real}
struct Frames{T<:Real,A<:AbstractArray{T,3},B<:AbstractArray{T,2}}
rotations::A
translations::B

A `Frame` is a combination of a rotation and a translation, which can be applied to a set of coordinates.
"""
struct Frame{T <: Real}
rotation::QuatRotation{T}
location::AbstractVector{T}

function Frame{T}(rotation::QuatRotation{T}, location::AbstractVector{T}) where T <: Real
length(location) == 3 || throw(ArgumentError("location must be a 3-vector"))
return new{T}(rotation, location)
end
end

Frame{T}(rotation::AbstractVecOrMat{T}, location::AbstractVector{T}) where T <: Real = Frame{T}(QuatRotation(rotation), location)
Frame{T}(rotation::AbstractVecOrMat{<:Real}, location::AbstractVector{<:Real}) where T <: Real = Frame{T}(T.(rotation), T.(location))
Frame(rotation::AbstractVecOrMat, location::AbstractVector) = Frame{promote_type(eltype(rotation), eltype(location))}(rotation, location)

Base.:(==)(frame1::Frame, frame2::Frame) = frame1.rotation == frame2.rotation && frame1.location == frame2.location

function (frame::Frame{T})(coords::AbstractMatrix{T}, coords_centroid::AbstractVector{T}=centroid(coords)) where T
return frame.rotation * (coords .- coords_centroid) .+ frame.location
end

"""
Frames{T <: Real, M <: AbstractMatrix{T}} <: AbstractVector{Frame{T}}
The `Frames` type is designed to efficiently store and manipulate the rotation and translation of a set of `Frame`s.
"""
struct Frames{T <: Real, M <: AbstractMatrix{T}} <: AbstractVector{Frame{T}}
rotations::M
locations::M

function Frames{T}(rotations::M, locations::M) where {T <: Real, M <: AbstractMatrix{T}}
size(rotations, 1) == 4 || throw(ArgumentError("rotations must be a 4xN quaternion matrix"))
size(locations, 1) == 3 || throw(ArgumentError("locations must be a 3xN 3D coordinates matrix"))
size(rotations, 2) == size(locations, 2) || throw(ArgumentError("rotations and locations must have the same number of columns"))
return new{T, M}(rotations, locations)
function Frames{T,A,B}(rotations::A, translations::B) where {T,A,B}
size(rotations)[1:2] == (3,3) || throw(ArgumentError("rotations must be a 3x3xL array"))
size(translations, 1) == 3 || throw(ArgumentError("translations must be a 3xN matrix"))
size(rotations, 3) == size(translations, 2) || throw(ArgumentError("rotations and translations must have the same number of columns"))
return new(rotations, translations)
end
end

function Frames{T}(rotmats::AbstractArray{T, 3}, locations::AbstractMatrix{T}) where T <: Real
rotations = stack(params(QuatRotation(rotmat)) for rotmat in eachslice(rotmats, dims=3))
return Frames{T}(rotations, locations)
end
Frames(rotations::A,translations::B) where {T<:Real,A<:AbstractArray{T,3},B<:AbstractMatrix{T}} = Frames{T,A,B}(rotations, translations)

function Frames{T}(rotations::AbstractArray{<:Real}, locations::AbstractMatrix{<:Real}) where T <: Real
return Frames{T}(T.(rotations), T.(locations))
end

function Frames(rotations::AbstractArray{<:Real}, locations::AbstractMatrix{<:Real})
T = promote_type(eltype(rotations), eltype(locations))
return Frames{T}(rotations, locations)
end

Base.size(frames::Frames) = Tuple(size(frames.rotations, 2))
Base.getindex(frames::Frames{T}, i::Integer) where T = Frame{T}(QuatRotation(frames.rotations[:, i]), frames.locations[:, i])

Base.:(==)(frames1::Frames, frames2::Frames) = all(f1 == f2 for (f1, f2) in zip(frames1, frames2))

function (frames::Frames{T})(coords::AbstractMatrix{<:Real}) where T <: Real
coords = T.(coords)
coords_centroid = centroid(coords)
return stack((f -> f(coords, coords_centroid)).(frames))
function Frames(rotations::AbstractArray{<:Real,3}, translations::AbstractArray{<:Real})
T = promote_type(eltype(rotations), eltype(translations))
return Frames(T.(rotations), T.(translations))
end

function Frames(backbone::Backbone{T}, ideal_coords::AbstractMatrix{<:Real}) where T <: Real
backbone = Backbone{T}(backbone.coords)
backbone = Backbone(backbone.coords)
ideal_coords = T.(ideal_coords)
L, r = divrem(length(backbone), size(ideal_coords, 2))
iszero(r) || throw(ArgumentError("backbone length ($(length(backbone))) must be divisible of the number of frame points ($(size(ideal_coords, 2)))"))
rotmats = similar(backbone.coords, 3, 3, L)
locations = similar(backbone.coords, 3, L)
rotations = similar(backbone.coords, 3, 3, L)
translations = similar(backbone.coords, 3, L)
for (i, noisy_coords) in enumerate(eachslice(reshape(backbone.coords, 3, size(ideal_coords, 2), :), dims=3))
rotmats[:, :, i], _, locations[:, i] = kabsch_algorithm(ideal_coords, noisy_coords)
rotations[:, :, i], _, translations[:, i] = kabsch_algorithm(ideal_coords, noisy_coords)
end
return Frames(rotmats, locations)
return Frames(rotations, translations)
end

(frames::Frames{T})(coords::AbstractMatrix{T}) where T<:Real = frames.rotations (coords .- centroid(coords)) .+ reshape(frames.translations, 3, 1, :)
(frames::Frames{T})(coords::AbstractMatrix{<:Real}) where T<:Real = frames(T.(coords))

Backbone(frames::Frames, ideal_coords::AbstractMatrix{<:Real}) = Backbone(frames(ideal_coords))
Loading

2 comments on commit a1d81bf

@AntonOresten
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator register

Release notes:

  • Support reading and writing .cif files with readchains and writechains.
  • Hide methods for getting bond lengths, angles, dihedrals from bond_vectors.
  • Stop exporting get_atom_displacements and get_atom_distances.
  • Load proteins with Float64 by default (instead of Float32).
  • Make get_dihedrals and get_bond_angles (hopefully) differentiable. (untested)
  • Remove Frame type in favor of batched multiplication.
  • Make (::Frames)(coords) differentiable

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/111368

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.11.0 -m "<description of version>" a1d81bf5cb324e16a8618b2edd03917cf1f20418
git push origin v0.11.0

Please sign in to comment.