abpytools.core package¶

Subpackages¶

Submodules¶

abpytools.core.base module¶

class abpytools.core.base.CollectionBase[source]¶

Bases: object

CollectionBase is the abpytools base class on which the collection APIs are built.

classmethod load_from_fasta(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_file(path, n_threads=20, verbose=True, show_progressbar=True, **kwargs)[source]¶

Args:
    path:
    n_threads: int, number of threads to use in the loading process
    verbose: bool, controls the level of verbosity
    show_progressbar: bool, whether to display the progress bar
    kwargs:

Returns:

classmethod load_from_json(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_pb2(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

save(file_format, path, update=True)[source]¶

Args:
    file_format:
    path:
    update:

Returns:

save_to_fasta(path, update=True)[source]¶

save_to_json(path, update=True)[source]¶

save_to_pb2(path, update=True)[source]¶

abpytools.core.cache module¶

class abpytools.core.cache.Cache(max_cache_size=10)[source]¶

Bases: object

add(key, data)[source]¶

empty_cache()[source]¶

remove(key)[source]¶

update(key, data, override=True)[source]¶
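
The signatures above suggest a small bounded key-value store. The following is a minimal sketch of such a store and not the abpytools implementation: the class name BoundedCache, the oldest-first eviction policy, the error raised on duplicate keys and the meaning given to override are all assumptions made for illustration.

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical bounded cache that mirrors the method names listed above."""

    def __init__(self, max_cache_size=10):
        self.max_cache_size = max_cache_size
        self._data = OrderedDict()

    def add(self, key, data):
        # Assumption: adding an existing key is an error; update() replaces it.
        if key in self._data:
            raise KeyError(f"{key!r} is already cached")
        # Assumption: the oldest entry is evicted once the cache is full.
        if len(self._data) >= self.max_cache_size:
            self._data.popitem(last=False)
        self._data[key] = data

    def update(self, key, data, override=True):
        # Assumption: override=False leaves an existing entry untouched.
        if key in self._data and not override:
            return
        self._data.pop(key, None)
        self.add(key, data)

    def remove(self, key):
        self._data.pop(key, None)

    def empty_cache(self):
        self._data.clear()

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


cache = BoundedCache(max_cache_size=2)
cache.add("mw", 1.0)
cache.add("pi", 2.0)
cache.add("charge", 3.0)  # the cache is full, so "mw" (the oldest entry) is evicted
```

An OrderedDict keeps insertion order, which makes oldest-first eviction a one-line operation; a different policy (for example least recently used) would need extra bookkeeping on reads.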

abpytools.core.chain module¶

class abpytools.core.chain.Chain(sequence, name='Chain1', numbering_scheme='chothia')[source]¶

Bases: object

The Chain object represents a single chain variable fragment (scFv) antibody.

An scFv can be part of either the heavy or light chain of an antibody. The nature of the chain is determined by querying the Abnum server with the sequence, which is implemented in the Chain.ab_numbering() method.

Attributes:
    numbering (list): the name of each position occupied by amino acids in the sequence
    mw (float): the cached molecular weight
    pI (float): the cached isoelectric point of the sequence
    cdr (tuple): tuple with two dictionaries, one for CDR and one for FR, holding the index of the amino acids in each region
    germline_identity (dict):

ab_charge(align=True, ph=7.4, pka_database='Wikipedia')[source]¶

Calculates the charge of each amino acid in the antibody sequence.

Parameters:
    pka_database –
    ph –
    align – if set to True, an alignment will be performed if it hasn’t already been done using the ab_numbering method

Returns:	array with amino acid charges
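
As a rough illustration of how a per-residue charge can be derived from a pH and a table of pKa values, the sketch below applies the Henderson-Hasselbalch equation to ionisable side chains. It is not taken from abpytools: the function names, the pKa table (approximate textbook values) and the decision to ignore the N- and C-terminal groups are assumptions made for this example.

```python
# Approximate side-chain pKa values, assumed for this sketch only; they are
# not guaranteed to match the library's 'Wikipedia' pKa database.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.2, "Y": 10.5}


def residue_charge(amino_acid, ph):
    """Fractional side-chain charge from the Henderson-Hasselbalch equation."""
    if amino_acid in POSITIVE_PKA:
        # Basic side chain: protonated (+1) well below its pKa.
        return 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[amino_acid]))
    if amino_acid in NEGATIVE_PKA:
        # Acidic side chain: deprotonated (-1) well above its pKa.
        return -1.0 / (1.0 + 10 ** (NEGATIVE_PKA[amino_acid] - ph))
    return 0.0  # non-ionisable side chain


def sequence_charges(sequence, ph=7.4):
    """Charge of every residue in the sequence; terminal groups are ignored."""
    return [residue_charge(aa, ph) for aa in sequence]


charges = sequence_charges("KDA")
total = sum(charges)  # a total charge is simply the sum of the per-residue values
```

At pH 7.4 lysine is almost fully protonated and aspartate almost fully deprotonated, so the two nearly cancel in the total.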

ab_ec(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶

ab_format()[source]¶

ab_hydrophobicity_matrix(hydrophobicity_scores='ew')[source]¶

ab_molecular_weight(monoisotopic=False)[source]¶

ab_numbering(server='abysis', **kwargs)[source]¶

Returns:	list

ab_numbering_table(as_array=False, replacement='-', region='all')[source]¶

Parameters:
    region –
    as_array – if True returns a numpy.array object, if False returns a pandas.DataFrame
    replacement – value to replace empty positions
Returns:

ab_pi(pi_database='Wikipedia')[source]¶

ab_regions()[source]¶

Determines the Chain region (CDR or Framework) of each amino acid in the sequence.

Returns:

ab_total_charge(ph=7.4, pka_database='Wikipedia')[source]¶

aligned_sequence¶

chain¶

static determine_chain_type(numbering)[source]¶

load()[source]¶

Generates all the data:
- Chain Numbering
- Hydrophobicity matrix
- Molecular weight
- pI

All the data is then stored in its respective attributes

Returns:

classmethod load_from_string(sequence, name='Chain1', numbering_scheme='chothia')[source]¶

Returns an instantiated Chain object from a sequence.

Args:
    sequence:
    name:
    numbering_scheme:

Returns:

name¶

numbering_scheme¶

sequence¶

set_name(name)[source]¶

status¶

abpytools.core.chain.amino_acid_charge(amino_acid, ph, pka_values)[source]¶

abpytools.core.chain.calculate_cdr(numbering, cdr_positions, framework_positions)[source]¶

Parameters:
    numbering –
    cdr_positions –
    framework_positions –
Returns:

abpytools.core.chain.calculate_charge(sequence, ph, pka_values)[source]¶

abpytools.core.chain.calculate_ec(sequence, ec_data)[source]¶

abpytools.core.chain.calculate_hydrophobicity_matrix(whole_sequence, numbering, aa_hydrophobicity_scores, sequence)[source]¶

abpytools.core.chain.calculate_mw(sequence, mw_data)[source]¶

abpytools.core.chain.calculate_pi(sequence, pi_data)[source]¶

abpytools.core.chain.get_ab_numbering(sequence, server, numbering_scheme, timeout=30)[source]¶

Return type:	list

abpytools.core.chain_collection module¶

class abpytools.core.chain_collection.ChainCollection(antibody_objects=None, load=True, **kwargs)[source]¶

Bases: abpytools.core.base.CollectionBase

Object that holds Chain objects and performs analysis on the ensemble.

ab_region_index()[source]¶

Determines the index of the amino acids in the CDR and FR regions.

Returns:	dictionary with names as keys; each value is a dictionary with keys ‘CDR’ and ‘FR’. The ‘CDR’ entry contains dictionaries with the CDR1, CDR2 and CDR3 regions; the ‘FR’ entry contains dictionaries with the FR1, FR2, FR3 and FR4 regions.
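
The nesting described for the return value can be pictured with a small hand-written dictionary. This is only a sketch of the shape: the chain name, the index values and the assumption that each region maps to a list of positions in the sequence are invented for illustration, and region_residues is a hypothetical helper that is not part of abpytools.

```python
# Hypothetical result for a collection holding one chain named "Chain1".
# The index values are made up; only the nesting follows the description.
region_index = {
    "Chain1": {
        "CDR": {"CDR1": [25, 26, 27], "CDR2": [51, 52], "CDR3": [94, 95, 96]},
        "FR": {"FR1": [0, 1, 2], "FR2": [35, 36], "FR3": [60, 61], "FR4": [102, 103]},
    }
}


def region_residues(sequence, index, name, kind, region):
    """Pick the residues of one region out of a sequence using the index."""
    return "".join(sequence[i] for i in index[name][kind][region])


# Placeholder sequence long enough to cover every index used above.
sequence = "ACDEFGHIKLMNPQRSTVWY" * 6
cdr2 = region_residues(sequence, region_index, "Chain1", "CDR", "CDR2")
```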

aligned_sequences¶

append(antibody_obj)[source]¶

chain¶

charge¶

composition(method='count')[source]¶

Amino acid composition of each sequence. Each resulting list is organised alphabetically (see composition.py).

Parameters:	method –
Returns:
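
A count-based composition with alphabetically ordered output could look like the sketch below. It is an illustration and not the code in composition.py: ordering by one-letter code, the restriction to the 20 standard residues and the 'freq' option are assumptions.

```python
from collections import Counter

# The 20 standard residues, ordered alphabetically by one-letter code
# (an assumption about what "alphabetically" means here).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence, method="count"):
    """Return one value per amino acid, in the order of AMINO_ACIDS."""
    counts = Counter(sequence)
    values = [counts[aa] for aa in AMINO_ACIDS]
    if method == "count":
        return values
    if method == "freq":  # hypothetical option, added for illustration
        total = sum(values)
        return [v / total for v in values] if total else [0.0] * len(values)
    raise ValueError(f"unknown method: {method!r}")


counts = composition("AACW")
freqs = composition("AACW", method="freq")
```

Because every sequence yields a list of the same length and order, the results can be stacked directly into a matrix for comparison across a collection.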

distance_matrix(feature=None, metric='cosine_similarity', multiprocessing=False)[source]¶

Returns the distance matrix using a given feature and distance metric.

Parameters:
    feature – string with the name of the feature to use
    metric – string with the name of the metric to use
    multiprocessing – bool to turn multiprocessing on/off (True/False)
Returns:	list of lists with the distances between all sequences; there are len(data) lists, each of len(data), and M_i,j = 0 when i == j
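
To make the shape of the result concrete, here is a minimal sketch that builds a symmetric list-of-lists matrix with a zero diagonal from per-sequence feature vectors. It is not the library's implementation: taking the distance as one minus the cosine similarity, the function names and the requirement that feature vectors are non-zero are assumptions.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def distance_matrix(features):
    """Pairwise distances as a list of lists, with M[i][i] == 0."""
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is symmetric.
        for j in range(i + 1, n):
            d = cosine_distance(features[i], features[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Three toy feature vectors standing in for per-sequence features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = distance_matrix(features)
```

Filling only the upper triangle halves the number of metric evaluations, which matters because the cost grows quadratically with the number of sequences.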

extinction_coefficients(extinction_coefficient_database='Standard', reduced=False)[source]¶

Parameters:
    extinction_coefficient_database – string with the name of the database to use
    reduced – bool, whether to consider the cysteines to be reduced
Returns:	list
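
The effect of the reduced flag can be illustrated with the widely used approximation for the molar extinction coefficient at 280 nm, which sums contributions from tryptophan, tyrosine and cystine (disulfide-bonded cysteine pairs). The sketch below uses the commonly quoted coefficients of 5500, 1490 and 125 M^-1 cm^-1; whether the library's 'Standard' database holds the same values is an assumption, as is pairing all cysteines into cystines when reduced is False.

```python
def extinction_coefficient(sequence, reduced=False):
    """Approximate molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    n_trp = sequence.count("W")
    n_tyr = sequence.count("Y")
    # Assumption: when not reduced, every pair of cysteines forms a cystine;
    # reduced cysteines contribute nothing at 280 nm.
    n_cystine = 0 if reduced else sequence.count("C") // 2
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


oxidised = extinction_coefficient("WYCC")
reduced = extinction_coefficient("WYCC", reduced=True)
```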

germline¶

germline_identity¶

get_object(name='')[source]¶

Parameters:	name – str
Returns:

hydrophobicity_matrix()[source]¶

igblast_local_query(file_path)[source]¶

igblast_server_query(chunk_size=50, show_progressbar=True, **kwargs)[source]¶

Parameters:
    show_progressbar –
    chunk_size –
    kwargs – keyword arguments to pass to igblast_options
Returns:

load(show_progressbar=True, n_threads=4, verbose=True)[source]¶

classmethod load_from_fasta(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_json(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_pb2(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

loading_status()[source]¶

molecular_weights(monoisotopic=False)[source]¶

Parameters:	monoisotopic – bool whether to use monoisotopic values
Returns:	list

n_ab¶

names¶

numbering_scheme¶

numbering_table(as_array=False, region='all')[source]¶

pop(index=-1)[source]¶

save_to_fasta(path, update=True)[source]¶

save_to_json(path, update=True)[source]¶

save_to_pb2(path, update=True)[source]¶

sequences¶

set_numbering_scheme(numbering_scheme, realign=True)[source]¶

total_charge¶

abpytools.core.chain_collection.igblast_options(sequences, domain='imgt', germline_db_V='IG_DB/imgt.Homo_sapiens.V.f.orf.p', germline_db_D='IG_DB/imgt.Homo_sapiens.D.f.orf', germline_db_J='IG_DB / imgt.Homo_sapiens.J.f.orf', num_alignments_V=1, num_alignments_D=1, num_alignments_J=1)[source]¶

abpytools.core.chain_collection.load_antibody_object(antibody_object)[source]¶

abpytools.core.chain_collection.load_from_antibody_object(antibody_objects, show_progressbar=True, n_threads=20, verbose=True)[source]¶

Args:
    antibody_objects (list):
    show_progressbar (bool):
    n_threads (int):
    verbose (bool):

Returns:

abpytools.core.chain_collection.load_igblast_query(igblast_result, names)[source]¶

Parameters:
    names –
    igblast_result –
Returns:

abpytools.core.chain_collection.make_fasta(names, sequences)[source]¶
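
Turning parallel lists of names and sequences into FASTA text is a small transformation, sketched below. This is an illustration only: build_fasta and its line_width parameter are hypothetical, and the actual output of make_fasta (line wrapping, trailing newline) is not documented here.

```python
def build_fasta(names, sequences, line_width=60):
    """Format names and sequences as FASTA records (hypothetical helper)."""
    if len(names) != len(sequences):
        raise ValueError("names and sequences must have the same length")
    records = []
    for name, sequence in zip(names, sequences):
        # Wrap long sequences so that no line exceeds line_width characters.
        lines = [
            sequence[i:i + line_width]
            for i in range(0, len(sequence), line_width)
        ]
        records.append(">" + name + "\n" + "\n".join(lines))
    return "\n".join(records) + "\n"


fasta_text = build_fasta(["a", "b"], ["ACD", "EF"])
```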

abpytools.core.chain_collection.worker(q)[source]¶

abpytools.core.fab module¶

class abpytools.core.fab.Fab(heavy_chain=None, light_chain=None, load=True, name=None)[source]¶

Bases: object

aligned_sequence¶

charge(**kwargs)[source]¶

extinction_coefficient(reduced=False, normalise=False, **kwargs)[source]¶

germline_identity¶

hydrophobicity_matrix(**kwargs)[source]¶

load()[source]¶

molecular_weight(monoisotopic=False)[source]¶

name¶

numbering_table(as_array=False, region='all', chain='both')[source]¶

sequence¶

total_charge(ph=7.4, pka_database='Wikipedia')[source]¶

abpytools.core.fab_collection module¶

class abpytools.core.fab_collection.FabCollection(fab=None, heavy_chains=None, light_chains=None, names=None)[source]¶

Bases: abpytools.core.base.CollectionBase

aligned_sequences¶

charge()[source]¶

extinction_coefficients(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶

germline¶

germline_identity¶

get_object(name)[source]¶

Parameters:	name – str
Returns:

hydrophobicity_matrix()[source]¶

igblast_local_query(file_path, chain)[source]¶

igblast_server_query(**kwargs)[source]¶

classmethod load_from_fasta(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_json(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_pb2(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

molecular_weights(monoisotopic=False)[source]¶

n_ab¶

names¶

numbering_table(as_array=False, region='all', chain='both', **kwargs)[source]¶

regions¶

save_to_fasta(path, update=True)[source]¶

save_to_json(path, update=True)[source]¶

save_to_pb2(path, update=True)[source]¶

sequences¶

total_charge(ph=7.4, pka_database='Wikipedia')[source]¶

abpytools.core.helper_functions module¶

abpytools.core.helper_functions.germline_identity_pd(heavy_identity, light_identity, internal_heavy, internal_light, names)[source]¶

abpytools.core.helper_functions.numbering_table_multiindex(region, whole_sequence_dict)[source]¶

abpytools.core.helper_functions.numbering_table_region(region)[source]¶

abpytools.core.helper_functions.numbering_table_sequences(region, numbering_scheme, chain)[source]¶

abpytools.core.helper_functions.to_numbering_table(as_array, region, chain, heavy_chains_numbering_table, light_chains_numbering_table, names, **kwargs)[source]¶

abpytools.core.utils module¶

abpytools.core.utils.add_Chain_to_protobuf(antibody_obj, proto_obj)[source]¶

Helper function to populate a ProtoChain message.

Args:
    antibody_obj (Chain):
    proto_obj (ChainProto):

Returns:

abpytools.core.utils.fasta_ChainCollection_parser(raw_fasta, numbering_scheme)[source]¶

abpytools.core.utils.json_ChainCollection_formatter(chain_objects)[source]¶

Internal function to serialise ChainCollection objects in JSON format.

Args:
    chain_objects (ChainCollection):

Returns:

abpytools.core.utils.json_ChainCollection_parser(raw_data)[source]¶

abpytools.core.utils.json_Chain_parser(antibody_dict, name)[source]¶

abpytools.core.utils.json_FabCollection_formatter(fab_object)[source]¶

Internal function to serialise FabCollection objects in JSON format.

Args:
    fab_object (FabCollection):

Returns:

abpytools.core.utils.json_FabCollection_parser(raw_data)[source]¶

abpytools.core.utils.pb2_ChainCollection_formatter(chain_objects, proto_parser, reset_status=True)[source]¶

Internal function to serialise a ChainCollection object to .pb2 format according to the definition in ‘format/chain.proto’.

Args:
    chain_objects (ChainCollection):
    proto_parser (ChainCollectionProto):
    reset_status (bool):

Returns:

abpytools.core.utils.pb2_ChainCollection_parser(proto_parser)[source]¶

abpytools.core.utils.pb2_Chain_parser(proto_chain)[source]¶

Populates a Chain object from a protobuf file.

Args:
    proto_chain (ChainProto):

Returns:

abpytools.core.utils.pb2_FabCollection_formatter(fab_object, proto_parser, reset_status=True)[source]¶

Internal function to serialise a FabCollection object to .pb2 format according to the definition in ‘format/fab.proto’.

Args:
    fab_object (FabCollection):
    proto_parser (FabCollectionProto):
    reset_status (bool):

Returns:

abpytools.core.utils.pb2_FabCollection_parser(proto_parser)[source]¶

abpytools.core.utils.pb2_add_chain(chain_object, proto_parser)[source]¶

Populates a protobuf ProtoChain message from a Chain object and adds it to a ChainCollectionProto.

Args:
    chain_object (Chain):
    proto_parser (ChainCollectionProto):

Returns:

Module contents¶

class abpytools.core.Chain(sequence, name='Chain1', numbering_scheme='chothia')[source]¶

Bases: object

The Chain object represents a single chain variable fragment (scFv) antibody.

An scFv can be part of either the heavy or light chain of an antibody. The nature of the chain is determined by querying the Abnum server with the sequence, which is implemented in the Chain.ab_numbering() method.

Attributes:
    numbering (list): the name of each position occupied by amino acids in the sequence
    mw (float): the cached molecular weight
    pI (float): the cached isoelectric point of the sequence
    cdr (tuple): tuple with two dictionaries, one for CDR and one for FR, holding the index of the amino acids in each region
    germline_identity (dict):

ab_charge(align=True, ph=7.4, pka_database='Wikipedia')[source]¶

Calculates the charge of each amino acid in the antibody sequence.

Parameters:
    pka_database –
    ph –
    align – if set to True, an alignment will be performed if it hasn’t already been done using the ab_numbering method

Returns:	array with amino acid charges

ab_ec(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶

ab_format()[source]¶

ab_hydrophobicity_matrix(hydrophobicity_scores='ew')[source]¶

ab_molecular_weight(monoisotopic=False)[source]¶

ab_numbering(server='abysis', **kwargs)[source]¶

Returns:	list

ab_numbering_table(as_array=False, replacement='-', region='all')[source]¶

Parameters:
    region –
    as_array – if True returns a numpy.array object, if False returns a pandas.DataFrame
    replacement – value to replace empty positions
Returns:

ab_pi(pi_database='Wikipedia')[source]¶

ab_regions()[source]¶

Determines the Chain region (CDR or Framework) of each amino acid in the sequence.

Returns:

ab_total_charge(ph=7.4, pka_database='Wikipedia')[source]¶

aligned_sequence¶

chain¶

static determine_chain_type(numbering)[source]¶

load()[source]¶

Generates all the data:
- Chain Numbering
- Hydrophobicity matrix
- Molecular weight
- pI

All the data is then stored in its respective attributes

Returns:

classmethod load_from_string(sequence, name='Chain1', numbering_scheme='chothia')[source]¶

Returns an instantiated Chain object from a sequence.

Args:
    sequence:
    name:
    numbering_scheme:

Returns:

name¶

numbering_scheme¶

sequence¶

set_name(name)[source]¶

status¶

class abpytools.core.ChainCollection(antibody_objects=None, load=True, **kwargs)[source]¶

Bases: abpytools.core.base.CollectionBase

Object that holds Chain objects and performs analysis on the ensemble.

ab_region_index()[source]¶

Determines the index of the amino acids in the CDR and FR regions.

Returns:	dictionary with names as keys; each value is a dictionary with keys ‘CDR’ and ‘FR’. The ‘CDR’ entry contains dictionaries with the CDR1, CDR2 and CDR3 regions; the ‘FR’ entry contains dictionaries with the FR1, FR2, FR3 and FR4 regions.

aligned_sequences¶

append(antibody_obj)[source]¶

chain¶

charge¶

composition(method='count')[source]¶

Amino acid composition of each sequence. Each resulting list is organised alphabetically (see composition.py).

Parameters:	method –
Returns:

distance_matrix(feature=None, metric='cosine_similarity', multiprocessing=False)[source]¶

Returns the distance matrix using a given feature and distance metric.

Parameters:
    feature – string with the name of the feature to use
    metric – string with the name of the metric to use
    multiprocessing – bool to turn multiprocessing on/off (True/False)
Returns:	list of lists with the distances between all sequences; there are len(data) lists, each of len(data), and M_i,j = 0 when i == j

extinction_coefficients(extinction_coefficient_database='Standard', reduced=False)[source]¶

Parameters:
    extinction_coefficient_database – string with the name of the database to use
    reduced – bool, whether to consider the cysteines to be reduced
Returns:	list

germline¶

germline_identity¶

get_object(name='')[source]¶

Parameters:	name – str
Returns:

hydrophobicity_matrix()[source]¶

igblast_local_query(file_path)[source]¶

igblast_server_query(chunk_size=50, show_progressbar=True, **kwargs)[source]¶

Parameters:
    show_progressbar –
    chunk_size –
    kwargs – keyword arguments to pass to igblast_options
Returns:

load(show_progressbar=True, n_threads=4, verbose=True)[source]¶

classmethod load_from_fasta(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_json(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_pb2(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

loading_status()[source]¶

molecular_weights(monoisotopic=False)[source]¶

Parameters:	monoisotopic – bool whether to use monoisotopic values
Returns:	list

n_ab¶

names¶

numbering_scheme¶

numbering_table(as_array=False, region='all')[source]¶

pop(index=-1)[source]¶

save_to_fasta(path, update=True)[source]¶

save_to_json(path, update=True)[source]¶

save_to_pb2(path, update=True)[source]¶

sequences¶

set_numbering_scheme(numbering_scheme, realign=True)[source]¶

total_charge¶

class abpytools.core.FabCollection(fab=None, heavy_chains=None, light_chains=None, names=None)[source]¶

Bases: abpytools.core.base.CollectionBase

aligned_sequences¶

charge()[source]¶

extinction_coefficients(extinction_coefficient_database='Standard', reduced=False, normalise=False, **kwargs)[source]¶

germline¶

germline_identity¶

get_object(name)[source]¶

Parameters:	name – str
Returns:

hydrophobicity_matrix()[source]¶

igblast_local_query(file_path, chain)[source]¶

igblast_server_query(**kwargs)[source]¶

classmethod load_from_fasta(path, numbering_scheme='chothia', n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_json(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

classmethod load_from_pb2(path, n_threads=20, verbose=True, show_progressbar=True)[source]¶

molecular_weights(monoisotopic=False)[source]¶

n_ab¶

names¶

numbering_table(as_array=False, region='all', chain='both', **kwargs)[source]¶

regions¶

save_to_fasta(path, update=True)[source]¶

save_to_json(path, update=True)[source]¶

save_to_pb2(path, update=True)[source]¶

sequences¶

total_charge(ph=7.4, pka_database='Wikipedia')[source]¶

class abpytools.core.Fab(heavy_chain=None, light_chain=None, load=True, name=None)[source]¶

Bases: object

aligned_sequence¶

charge(**kwargs)[source]¶

extinction_coefficient(reduced=False, normalise=False, **kwargs)[source]¶

germline_identity¶

hydrophobicity_matrix(**kwargs)[source]¶

load()[source]¶

molecular_weight(monoisotopic=False)[source]¶

name¶

numbering_table(as_array=False, region='all', chain='both')[source]¶

sequence¶

total_charge(ph=7.4, pka_database='Wikipedia')[source]¶
