abpytools.core package¶
Submodules¶
abpytools.core.base module¶
-
class
abpytools.core.base.
CollectionBase
[source]¶ Bases:
object
CollectionBase is the abpytools base class to develop the collection APIs
-
classmethod
load_from_fasta
(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶
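As an illustration of the input that load_from_fasta consumes, the sketch below parses FASTA text into name/sequence pairs. It is not abpytools code: parse_fasta is a hypothetical helper, and how the library reads the file internally is not documented here.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line closes the previous record and opens a new one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


example = ">seq1\nEVQLVESGG\nGLVQPGG\n>seq2\nDIQMTQSPS\n"
records = parse_fasta(example)
```

Each resulting pair carries the two pieces of information that the Chain constructor signature on this page also takes: a sequence and a name.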
-
classmethod
load_from_file
(path, n_threads=20, verbose=True, show_progressbar=True, **kwargs)[source]¶
Args:
- path:
- n_threads: int, number of threads to use in the loading process
- verbose: bool, controls the level of verbosity
- show_progressbar: bool, whether to display the progress bar
- kwargs:
Returns:
abpytools.core.cache module¶
abpytools.core.chain module¶
-
class
abpytools.core.chain.
Chain
(sequence, name='Chain1', numbering_scheme='chothia')[source]¶ Bases:
object
The Chain object represents a single chain variable fragment (scFv) antibody.
An scFv can be part of either the heavy or light chain of an antibody. The nature of the chain is determined by submitting the sequence to the Abnum server, which is implemented in the Chain.ab_numbering() method.
Attributes:
- numbering (list): the name of each position occupied by amino acids in sequence
- mw (float): the cached molecular weight
- pI (float): the cached isoelectric point of the sequence
- cdr (tuple): tuple with two dictionaries for CDR and FR with the index of the amino acids in each region
- germline_identity (dict):
-
ab_charge
(align=True, ph=7.4, pka_database='Wikipedia')[source]¶ Method to calculate the charge of each amino acid of the antibody.
Parameters: - pka_database –
- ph –
- align – if set to True, an alignment will be performed if it hasn’t been done already using the ab_numbering method
Returns: array with amino acid charges
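A per-residue charge at a given pH is commonly estimated with the Henderson-Hasselbalch equation. The sketch below shows that calculation under stated assumptions: the pKa values are approximate textbook side-chain values, not necessarily those in the 'Wikipedia' pka_database, terminal groups are ignored, and residue_charge and sequence_charges are hypothetical names rather than abpytools functions.

```python
# Approximate textbook side-chain pKa values (an assumption for illustration;
# not necessarily what the 'Wikipedia' pka_database contains).
PKA_BASIC = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def residue_charge(residue, ph=7.4):
    """Fractional side-chain charge from the Henderson-Hasselbalch equation."""
    if residue in PKA_BASIC:
        # Basic groups carry +1 when protonated.
        return 1.0 / (1.0 + 10 ** (ph - PKA_BASIC[residue]))
    if residue in PKA_ACIDIC:
        # Acidic groups carry -1 when deprotonated.
        return -1.0 / (1.0 + 10 ** (PKA_ACIDIC[residue] - ph))
    return 0.0  # residues without an ionisable side chain


def sequence_charges(sequence, ph=7.4):
    """Return one charge per residue, in sequence order."""
    return [residue_charge(aa, ph) for aa in sequence]


charges = sequence_charges("KDGH")
```

At pH 7.4 lysine is almost fully protonated and aspartate almost fully deprotonated, while histidine, whose pKa lies below the pH, contributes only a small positive charge.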
-
ab_ec
(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶
-
ab_numbering_table
(as_array=False, replacement='-', region='all')[source]¶ Parameters: - region –
- as_array – if True returns numpy.array object, if False returns a pandas.DataFrame
- replacement – value to replace empty positions
Returns:
-
ab_regions
()[source]¶ Method to determine the Chain region (CDR or Framework) of each amino acid in the sequence.
Returns:
-
aligned_sequence
¶
-
chain
¶
-
load
()[source]¶ Generates all the data:
- Chain numbering
- Hydrophobicity matrix
- Molecular weight
- pI
All the data is then stored in its respective attributes.
Returns:
-
classmethod
load_from_string
(sequence, name='Chain1', numbering_scheme='chothia')[source]¶ Returns an instantiated Chain object from a sequence.
Args:
- sequence:
- name:
- numbering_scheme:
Returns:
-
name
¶
-
numbering_scheme
¶
-
sequence
¶
-
status
¶
-
abpytools.core.chain.
calculate_cdr
(numbering, cdr_positions, framework_positions)[source]¶ Parameters: - numbering –
- cdr_positions –
- framework_positions –
Returns:
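The parameters of calculate_cdr are undocumented, so the following is only a sketch of what a function with this shape could do. It assumes that numbering is a list of position names and that cdr_positions and framework_positions map region names to position names; region_indices is a hypothetical stand-in, and the toy positions are not real Chothia boundaries.

```python
def region_indices(numbering, cdr_positions, framework_positions):
    """Map each region name to the indices of `numbering` that fall in it."""
    def collect(regions):
        result = {}
        for region, positions in regions.items():
            wanted = set(positions)
            # Keep the index of every numbered position that belongs to the region.
            result[region] = [i for i, pos in enumerate(numbering) if pos in wanted]
        return result

    # Two dictionaries, one for CDRs and one for framework regions.
    return collect(cdr_positions), collect(framework_positions)


numbering = ["H1", "H2", "H3", "H4", "H5"]  # toy numbering, for illustration only
cdr, fr = region_indices(
    numbering,
    cdr_positions={"CDR1": ["H3", "H4"]},
    framework_positions={"FR1": ["H1", "H2"], "FR2": ["H5"]},
)
```

The pair of dictionaries mirrors the description of the Chain cdr attribute, a tuple with two dictionaries for CDR and FR holding the index of the amino acids in each region.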
abpytools.core.chain_collection module¶
-
class
abpytools.core.chain_collection.
ChainCollection
(antibody_objects=None, load=True, **kwargs)[source]¶ Bases:
abpytools.core.base.CollectionBase
Object containing Chain objects, used to perform analysis on the ensemble.
-
ab_region_index
()[source]¶ Method to determine the index of amino acids in CDR regions.
Returns: dictionary with names as keys, where each value is a dictionary with keys CDR and FR
- the ‘CDR’ entry contains dictionaries with CDR1, CDR2 and CDR3 regions
- the ‘FR’ entry contains dictionaries with FR1, FR2, FR3 and FR4 regions
-
aligned_sequences
¶
-
chain
¶
-
charge
¶
-
composition
(method='count')[source]¶ Amino acid composition of each sequence. Each resulting list is organised alphabetically (see composition.py).
Parameters: method –
Returns:
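A count-based composition can be sketched as follows. This is not the code in composition.py: count_composition is a hypothetical helper, and ordering the list by one-letter amino acid code is an assumption about what "alphabetically" means here.

```python
from collections import Counter

# The 20 standard amino acids, sorted by one-letter code (assumed ordering).
AMINO_ACIDS = sorted("ACDEFGHIKLMNPQRSTVWY")


def count_composition(sequence):
    """Return the number of occurrences of each amino acid, in fixed order."""
    counts = Counter(sequence)
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


comp = count_composition("ACCA")
```

Because every sequence yields a list of the same length and order, the results can be compared position by position across a collection.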
-
distance_matrix
(feature=None, metric='cosine_similarity', multiprocessing=False)[source]¶ Returns the distance matrix using a given feature and distance metric.
Parameters: - feature – string with the name of the feature to use
- metric – string with the name of the metric to use
- multiprocessing – bool to turn multiprocessing on/off (True/False)
Returns: list of lists with the distances between all sequences; the outer list has len(data) entries and each inner list has len(data) entries. When i == j, M_i,j = 0.
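The shape of the result, a square list of lists with zeros on the diagonal, can be illustrated with a small standalone sketch. It assumes that the distance is one minus the cosine similarity of two feature vectors, which is a guess at what the 'cosine_similarity' metric yields; the function names are hypothetical and this is not the library's implementation.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; both vectors must have a non-zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def pairwise_distances(data):
    """Return a len(data) x len(data) list of lists of distances."""
    n = len(data)
    matrix = [[0.0] * n for _ in range(n)]  # diagonal stays 0
    for i in range(n):
        for j in range(i + 1, n):
            # The metric is symmetric, so compute each pair once.
            d = cosine_distance(data[i], data[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = pairwise_distances(features)
```

Filling only the upper triangle and mirroring it halves the number of metric evaluations, which matters when the collection is large.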
-
extinction_coefficients
(extinction_coefficient_database='Standard', reduced=False)[source]¶ Parameters: - extinction_coefficient_database – string with the name of the database to use
- reduced – bool whether to consider the cysteines to be reduced
Returns: list
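To show what the reduced flag changes, the sketch below estimates the extinction coefficient at 280 nm from residue counts. It uses the widely cited per-residue values of 5500 for tryptophan, 1490 for tyrosine and 125 for cystine (M^-1 cm^-1); whether the 'Standard' database holds exactly these numbers is an assumption, as is pairing all cysteines into cystines when reduced is False. The function name is hypothetical.

```python
def extinction_coefficient(sequence, reduced=False):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    n_trp = sequence.count("W")
    n_tyr = sequence.count("Y")
    # Reduced cysteines do not absorb appreciably at 280 nm; otherwise assume
    # that every pair of cysteines forms one cystine.
    n_cystine = 0 if reduced else sequence.count("C") // 2
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


coefficients = [extinction_coefficient(s) for s in ["WYCC", "AAA"]]
```

As in the documented method, the sketch produces one value per sequence and collects them in a list.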
-
germline
¶
-
germline_identity
¶
-
igblast_server_query
(chunk_size=50, show_progressbar=True, **kwargs)[source]¶ Parameters: - show_progressbar –
- chunk_size –
- kwargs – keyword arguments to pass to igblast_options
Returns:
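The chunk_size parameter suggests that sequences are sent to the server in batches. How the method does this internally is not documented here; the sketch below only shows the general batching idea, with chunked as a hypothetical helper and placeholder strings instead of real sequences.

```python
def chunked(items, chunk_size=50):
    """Yield consecutive slices of `items` holding at most `chunk_size` entries."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


sequences = ["SEQ{}".format(i) for i in range(120)]  # placeholder data
batches = list(chunked(sequences, chunk_size=50))
```

With 120 items and a chunk size of 50, the last batch is shorter than the others, so any code that consumes the batches should not assume a fixed length.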
-
classmethod
load_from_fasta
(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶
-
molecular_weights
(monoisotopic=False)[source]¶ Parameters: monoisotopic – bool whether to use monoisotopic values
Returns: list
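The difference between average and monoisotopic values can be shown on a toy scale. The sketch below sums residue masses and adds one water molecule for the free termini; the mass tables cover only glycine and alanine with rounded values, and both the tables and the function name are illustrative assumptions rather than the library's data.

```python
# Rounded residue masses in daltons (illustrative subset, G and A only).
AVERAGE = {"G": 57.05, "A": 71.08}
MONOISOTOPIC = {"G": 57.021, "A": 71.037}
WATER = {"average": 18.015, "monoisotopic": 18.011}


def molecular_weight(sequence, monoisotopic=False):
    """Sum residue masses and add one water for the terminal H and OH."""
    table = MONOISOTOPIC if monoisotopic else AVERAGE
    water = WATER["monoisotopic" if monoisotopic else "average"]
    return sum(table[aa] for aa in sequence) + water


weights = [molecular_weight(s) for s in ["GA", "GGA"]]
```

Monoisotopic masses use only the most abundant isotope of each element, so for these residues they come out slightly lower than the average masses.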
-
n_ab
¶
-
names
¶
-
numbering_scheme
¶
-
sequences
¶
-
total_charge
¶
-
-
abpytools.core.chain_collection.
igblast_options
(sequences, domain='imgt', germline_db_V='IG_DB/imgt.Homo_sapiens.V.f.orf.p', germline_db_D='IG_DB/imgt.Homo_sapiens.D.f.orf', germline_db_J='IG_DB / imgt.Homo_sapiens.J.f.orf', num_alignments_V=1, num_alignments_D=1, num_alignments_J=1)[source]¶
-
abpytools.core.chain_collection.
load_from_antibody_object
(antibody_objects, show_progressbar=True, n_threads=20, verbose=True)[source]¶
Args:
- antibody_objects (list):
- show_progressbar (bool):
- n_threads (int):
- verbose (bool):
Returns:
abpytools.core.fab module¶
abpytools.core.fab_collection module¶
-
class
abpytools.core.fab_collection.
FabCollection
(fab=None, heavy_chains=None, light_chains=None, names=None)[source]¶ Bases:
abpytools.core.base.CollectionBase
-
aligned_sequences
¶
-
extinction_coefficients
(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶
-
germline
¶
-
germline_identity
¶
-
classmethod
load_from_fasta
(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶
-
n_ab
¶
-
names
¶
-
regions
¶
-
sequences
¶
-
abpytools.core.helper_functions module¶
abpytools.core.utils module¶
-
abpytools.core.utils.
add_Chain_to_protobuf
(antibody_obj, proto_obj)[source]¶ Helper function to populate a ProtoChain message.
Args:
- antibody_obj (Chain):
- proto_obj (ChainProto):
Returns:
-
abpytools.core.utils.
json_ChainCollection_formatter
(chain_objects)[source]¶ Internal function to serialise ChainCollection objects in JSON format.
Args:
- chain_objects (ChainCollection):
Returns:
-
abpytools.core.utils.
json_FabCollection_formatter
(fab_object)[source]¶ Internal function to serialise FabCollection objects in JSON format.
Args:
- fab_object (FabCollection):
Returns:
-
abpytools.core.utils.
pb2_ChainCollection_formatter
(chain_objects, proto_parser, reset_status=True)[source]¶ Internal function to serialise a ChainCollection object to .pb2 format according to the definition in ‘format/chain.proto’.
Args:
- chain_objects (ChainCollection):
- proto_parser (ChainCollectionProto):
- reset_status (bool):
Returns:
-
abpytools.core.utils.
pb2_Chain_parser
(proto_chain)[source]¶ Populate a Chain object from a protobuf file.
Args:
- proto_chain (ChainProto):
Returns:
-
abpytools.core.utils.
pb2_FabCollection_formatter
(fab_object, proto_parser, reset_status=True)[source]¶ Internal function to serialise a FabCollection object to .pb2 format according to the definition in ‘format/fab.proto’.
Args:
- fab_object (FabCollection):
- proto_parser (FabCollectionProto):
- reset_status (bool):
Returns:
Module contents¶
-
class
abpytools.core.
Chain
(sequence, name='Chain1', numbering_scheme='chothia')[source]¶ Bases:
object
The Chain object represents a single chain variable fragment (scFv) antibody.
An scFv can be part of either the heavy or light chain of an antibody. The nature of the chain is determined by submitting the sequence to the Abnum server, which is implemented in the Chain.ab_numbering() method.
Attributes:
- numbering (list): the name of each position occupied by amino acids in sequence
- mw (float): the cached molecular weight
- pI (float): the cached isoelectric point of the sequence
- cdr (tuple): tuple with two dictionaries for CDR and FR with the index of the amino acids in each region
- germline_identity (dict):
-
ab_charge
(align=True, ph=7.4, pka_database='Wikipedia')[source]¶ Method to calculate the charge of each amino acid of the antibody.
Parameters: - pka_database –
- ph –
- align – if set to True, an alignment will be performed if it hasn’t been done already using the ab_numbering method
Returns: array with amino acid charges
-
ab_ec
(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶
-
ab_numbering_table
(as_array=False, replacement='-', region='all')[source]¶ Parameters: - region –
- as_array – if True returns numpy.array object, if False returns a pandas.DataFrame
- replacement – value to replace empty positions
Returns:
-
ab_regions
()[source]¶ Method to determine the Chain region (CDR or Framework) of each amino acid in the sequence.
Returns:
-
aligned_sequence
¶
-
chain
¶
-
load
()[source]¶ Generates all the data:
- Chain numbering
- Hydrophobicity matrix
- Molecular weight
- pI
All the data is then stored in its respective attributes.
Returns:
-
classmethod
load_from_string
(sequence, name='Chain1', numbering_scheme='chothia')[source]¶ Returns an instantiated Chain object from a sequence.
Args:
- sequence:
- name:
- numbering_scheme:
Returns:
-
name
¶
-
numbering_scheme
¶
-
sequence
¶
-
status
¶
-
class
abpytools.core.
ChainCollection
(antibody_objects=None, load=True, **kwargs)[source]¶ Bases:
abpytools.core.base.CollectionBase
Object containing Chain objects, used to perform analysis on the ensemble.
-
ab_region_index
()[source]¶ Method to determine the index of amino acids in CDR regions.
Returns: dictionary with names as keys, where each value is a dictionary with keys CDR and FR
- the ‘CDR’ entry contains dictionaries with CDR1, CDR2 and CDR3 regions
- the ‘FR’ entry contains dictionaries with FR1, FR2, FR3 and FR4 regions
-
aligned_sequences
¶
-
chain
¶
-
charge
¶
-
composition
(method='count')[source]¶ Amino acid composition of each sequence. Each resulting list is organised alphabetically (see composition.py).
Parameters: method –
Returns:
-
distance_matrix
(feature=None, metric='cosine_similarity', multiprocessing=False)[source]¶ Returns the distance matrix using a given feature and distance metric.
Parameters: - feature – string with the name of the feature to use
- metric – string with the name of the metric to use
- multiprocessing – bool to turn multiprocessing on/off (True/False)
Returns: list of lists with the distances between all sequences; the outer list has len(data) entries and each inner list has len(data) entries. When i == j, M_i,j = 0.
-
extinction_coefficients
(extinction_coefficient_database='Standard', reduced=False)[source]¶ Parameters: - extinction_coefficient_database – string with the name of the database to use
- reduced – bool whether to consider the cysteines to be reduced
Returns: list
-
germline
¶
-
germline_identity
¶
-
igblast_server_query
(chunk_size=50, show_progressbar=True, **kwargs)[source]¶ Parameters: - show_progressbar –
- chunk_size –
- kwargs – keyword arguments to pass to igblast_options
Returns:
-
classmethod
load_from_fasta
(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶
-
molecular_weights
(monoisotopic=False)[source]¶ Parameters: monoisotopic – bool whether to use monoisotopic values
Returns: list
-
n_ab
¶
-
names
¶
-
numbering_scheme
¶
-
sequences
¶
-
total_charge
¶
-
-
class
abpytools.core.
FabCollection
(fab=None, heavy_chains=None, light_chains=None, names=None)[source]¶ Bases:
abpytools.core.base.CollectionBase
-
aligned_sequences
¶
-
extinction_coefficients
(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶
-
germline
¶
-
germline_identity
¶
-
classmethod
load_from_fasta
(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶
-
n_ab
¶
-
names
¶
-
regions
¶
-
sequences
¶
-