Extracting human gene families from HGNC

This repository processes the gene family data from HGNC. In the future, the repository may expand its scope to process other types of HGNC data.

Notebooks

  • 1.download.ipynb downloads HGNC data. Check this notebook to see the last modified dates of downloaded files.
  • 2.families.ipynb constructs the gene family ontology in networkx and annotates each gene family with its corresponding Entrez Gene IDs. Gene membership is propagated up the ontology, e.g. genes belonging to the "Glutamate metabotropic receptors" family also belong to the "Glutamate receptors" family.
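
The membership propagation described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `propagate_genes`, the parent-to-child edge direction, and the integer gene IDs are all assumptions made for the example.

```python
import networkx as nx


def propagate_genes(ontology, direct_genes):
    """Map each family to its own genes plus the genes of all descendant families.

    Assumes edges point from a parent family to a child family.
    """
    propagated = {}
    for family in ontology.nodes:
        genes = set(direct_genes.get(family, ()))
        # nx.descendants returns every node reachable from `family`.
        for descendant in nx.descendants(ontology, family):
            genes |= set(direct_genes.get(descendant, ()))
        propagated[family] = genes
    return propagated


# Toy ontology with one parent family and one child family.
ontology = nx.DiGraph()
ontology.add_edge("Glutamate receptors", "Glutamate metabotropic receptors")

# Placeholder integers standing in for Entrez Gene IDs (not real annotations).
direct_genes = {
    "Glutamate receptors": {201},
    "Glutamate metabotropic receptors": {101, 102},
}

propagated = propagate_genes(ontology, direct_genes)
```

Here the parent family ends up with its own gene plus both genes of the child family, while the child family keeps only its direct members.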

Files & Directories

  • download contains unmodified downloads from the EBI FTP site.
  • data contains generated datasets. families.graphml is a GraphML-formatted network of the HGNC gene family ontology. gene-families.tsv maps gene families to their Entrez Gene IDs.
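
As a rough sketch of how the two generated files might be read back, the snippet below loads a GraphML file with networkx and a tab-separated file with the standard csv module. The helper `load_outputs` is hypothetical, and because the real column names of gene-families.tsv are not documented here, the demo runs on toy files whose columns (`family_name`, `entrez_gene_id`) are invented for illustration.

```python
import csv
import tempfile
from pathlib import Path

import networkx as nx


def load_outputs(data_dir):
    """Load the ontology graph and the family-to-gene rows from a data directory."""
    data_dir = Path(data_dir)
    graph = nx.read_graphml(str(data_dir / "families.graphml"))
    with open(data_dir / "gene-families.tsv", newline="") as handle:
        # DictReader takes field names from the header row of the file.
        rows = list(csv.DictReader(handle, delimiter="\t"))
    return graph, rows


# Demo on toy files written to a temporary directory (not the real datasets).
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    toy = nx.DiGraph()
    toy.add_edge("Glutamate receptors", "Glutamate metabotropic receptors")
    nx.write_graphml(toy, str(tmp_dir / "families.graphml"))
    (tmp_dir / "gene-families.tsv").write_text(
        "family_name\tentrez_gene_id\nGlutamate receptors\t101\n"
    )
    graph, rows = load_outputs(tmp_dir)
```

To use it on the repository's own output, pass the path of the data directory to `load_outputs` and inspect the header of the returned rows to see the actual column names.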

Questions

Have a question? Submit all feedback or questions via GitHub issues!