uniprotlib

A Python library for parsing UniProt XML files and ID mapping data. It handles both single-entry downloads and multi-GB gzip-compressed database dumps with bounded memory usage.

Installation

pip install uniprotlib

Or with uv:

uv add uniprotlib

Quick start

Parse UniProt XML

from uniprotlib import parse_xml

for entry in parse_xml("uniprot_sprot.xml.gz"):
    print(entry.primary_accession, entry.protein_name)
    print(entry.organism.scientific_name, entry.organism.tax_id)
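
For readers curious how a multi-GB gzip-compressed XML dump can be read with bounded memory, here is a minimal standard-library sketch of the general streaming technique. It is not taken from uniprotlib and may differ from what the library actually does. The names `iter_accessions` and `local_name`, the sample document, and its namespace URI are all made up for illustration.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def iter_accessions(stream):
    """Yield the first <accession> of each <entry>, one entry at a time.

    Hypothetical helper: illustrates streaming only, not uniprotlib's API.
    """
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == "entry":
            for child in elem:
                if local_name(child.tag) == "accession":
                    yield child.text
                    break
            # Drop the finished entry so the tree never grows with file size.
            root.clear()


# Made-up sample data, compressed in memory to stand in for a .xml.gz file.
SAMPLE = b"""<uniprot xmlns="http://example.org/ns">
<entry><accession>P12345</accession><accession>Q00001</accession></entry>
<entry><accession>P67890</accession></entry>
</uniprot>"""

compressed = io.BytesIO(gzip.compress(SAMPLE))
with gzip.GzipFile(fileobj=compressed) as stream:
    accessions = list(iter_accessions(stream))
print(accessions)
```

With a real file, `gzip.open(path, "rb")` would take the place of the in-memory stream. The key step is clearing the root after each entry, so only one entry is held in memory at a time.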

Parse ID mappings

from uniprotlib import parse_idmapping

for m in parse_idmapping("idmapping.dat.gz", id_type="GeneID"):
    print(m.accession, m.id)
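
As background, UniProt's `idmapping.dat` is a tab-separated file with three columns: accession, ID type, and ID. The sketch below shows, with the standard library only, what filtering by ID type amounts to. It is an illustration under that assumption, not uniprotlib's implementation. `iter_mappings` is a hypothetical helper, and the sample rows are invented.

```python
import io


def iter_mappings(lines, id_type=None):
    """Yield (accession, id_type, id) tuples from tab-separated lines.

    Hypothetical helper: illustrates the file layout, not uniprotlib's API.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        accession, kind, value = line.split("\t", 2)
        if id_type is None or kind == id_type:
            yield accession, kind, value


# Invented rows standing in for the contents of idmapping.dat.
SAMPLE = io.StringIO(
    "P12345\tGeneID\t1001\n"
    "P12345\tUniProtKB-ID\tEXAMPLE_HUMAN\n"
    "P67890\tGeneID\t2002\n"
)

gene_ids = list(iter_mappings(SAMPLE, id_type="GeneID"))
print(gene_ids)
```

For a compressed file, `gzip.open(path, "rt")` yields text lines that can be passed straight to the same generator, so the file is never loaded whole.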

See the UniProt XML and ID Mapping usage guides for more examples, or the API Reference for full details.