KGX Utils

Utility methods that are reused across the codebase.

kgx.utils.kgx_utils

class kgx.utils.kgx_utils.GraphEntityType(value)

Bases: Enum

An enumeration.

kgx.utils.kgx_utils.apply_edge_filters(graph: BaseGraph, edge_filters: Dict[str, Union[str, Set]]) -> None

Apply filters to graph and remove edges that do not pass given filters.

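Parameters:
  • graph (BaseGraph) – The graph to filter

  • edge_filters (Dict[str, Union[str, Set]]) – The edge filters to apply

kgx.utils.kgx_utils.apply_filters(graph: BaseGraph, node_filters: Dict[str, Union[str, Set]], edge_filters: Dict[str, Union[str, Set]]) -> None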

Apply filters to graph and remove nodes and edges that do not pass given filters.

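Parameters:
  • graph (BaseGraph) – The graph to filter

  • node_filters (Dict[str, Union[str, Set]]) – The node filters to apply

  • edge_filters (Dict[str, Union[str, Set]]) – The edge filters to apply

kgx.utils.kgx_utils.apply_graph_operations(graph: BaseGraph, operations: List) -> None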

Apply graph operations to a given graph.

Parameters:
  • graph (BaseGraph) – The graph to operate on

  • operations (List) – The operations to apply

kgx.utils.kgx_utils.apply_node_filters(graph: BaseGraph, node_filters: Dict[str, Union[str, Set]]) -> None

Apply filters to graph and remove nodes that do not pass given filters.

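Parameters:
  • graph (BaseGraph) – The graph to filter

  • node_filters (Dict[str, Union[str, Set]]) – The node filters to apply

kgx.utils.kgx_utils.build_export_row(data: Dict, list_delimiter: Optional[str] = None) -> Dict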

Sanitize key-value pairs in dictionary. This should be used to ensure proper syntax and types for node and edge data as it is exported.

Parameters:
  • data (Dict) – A dictionary containing key-value pairs

  • list_delimiter (str) – Optionally provide a delimiter character or string to be used to convert lists into strings.

Returns:

A dictionary containing processed key-value pairs

Return type:

Dict
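The list-to-string conversion described above can be sketched as follows. This is a minimal illustration, not the KGX implementation: the helper name export_row_sketch and the choice to drop None values are assumptions made for the example.

```python
from typing import Any, Dict, Optional


def export_row_sketch(data: Dict[str, Any], list_delimiter: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for an export sanitizer; not the KGX code."""
    row: Dict[str, Any] = {}
    for key, value in data.items():
        if value is None:
            # Assumption: keys without a value are left out of the export.
            continue
        if isinstance(value, list) and list_delimiter is not None:
            # Flatten a multivalued field into one delimited string.
            row[key] = list_delimiter.join(str(v) for v in value)
        else:
            row[key] = value
    return row


node = {"id": "HGNC:11603", "category": ["biolink:Gene", "biolink:NamedThing"], "name": None}
exported = export_row_sketch(node, list_delimiter="|")
print(exported)
```

Without a delimiter the list is passed through unchanged, so the caller decides how multivalued fields are serialized.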

kgx.utils.kgx_utils.camelcase_to_sentencecase(s: str) -> str

Convert CamelCase to sentence case.

Parameters:

s (str) – Input string in CamelCase

Returns:

string in sentence case form

Return type:

str
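For illustration, a CamelCase to sentence case conversion might look like the sketch below. The function camel_to_sentence_sketch is hypothetical and uses a regular expression; the library may implement the conversion differently.

```python
import re


def camel_to_sentence_sketch(s: str) -> str:
    """Hypothetical illustration of CamelCase -> sentence case."""
    # Insert a space wherever a lowercase letter or digit is followed
    # by an uppercase letter, then lowercase the whole string.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", s).lower()


print(camel_to_sentence_sketch("NamedThing"))
print(camel_to_sentence_sketch("GeneToDiseaseAssociation"))
```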

kgx.utils.kgx_utils.close_connection(conn)

Close a database connection to the SQLite database.

Returns:

None

kgx.utils.kgx_utils.contract(uri: str, prefix_maps: Optional[List[Dict]] = None, fallback: bool = True) -> str

Contract a given URI to a CURIE, based on mappings from prefix_maps. If no prefix map is provided, the defaults from prefixcommons-py are used.

This method will return the URI as the CURIE if there is no mapping found.

Parameters:
  • uri (str) – A URI

  • prefix_maps (Optional[List[Dict]]) – A list of prefix maps to use for mapping

  • fallback (bool) – Determines whether to fall back to the default prefix mappings, as determined by prefixcommons.curie_util, when the URI prefix is not found in prefix_maps.

Returns:

A CURIE corresponding to the URI

Return type:

str
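The contraction logic can be sketched without any dependencies. The function contract_sketch and the prefix map below are illustrative assumptions; the sketch does not model the fallback to the prefixcommons defaults, and the first matching namespace wins.

```python
from typing import Dict, List, Optional


def contract_sketch(uri: str, prefix_maps: Optional[List[Dict[str, str]]] = None) -> str:
    """Hypothetical illustration of URI -> CURIE contraction."""
    for prefix_map in prefix_maps or []:
        for prefix, namespace in prefix_map.items():
            if uri.startswith(namespace):
                # Replace the namespace with "prefix:".
                return f"{prefix}:{uri[len(namespace):]}"
    # No mapping found: return the URI unchanged, as described above.
    return uri


# Illustrative prefix map, assumed for this example only.
example_maps = [{"HGNC": "https://identifiers.org/hgnc/"}]
print(contract_sketch("https://identifiers.org/hgnc/11603", example_maps))
print(contract_sketch("https://example.org/unknown/1", example_maps))
```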

kgx.utils.kgx_utils.create_connection(db_file)

Create a database connection to the SQLite database specified by db_file.

Parameters:

db_file – database file

Returns:

Connection object or None
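The "Connection object or None" contract can be illustrated with the standard sqlite3 module. The helper open_sqlite_sketch is a hypothetical name, and returning None on sqlite3.Error is an assumption about how such a helper would typically behave.

```python
import sqlite3
from typing import Optional


def open_sqlite_sketch(db_file: str) -> Optional[sqlite3.Connection]:
    """Hypothetical illustration: return a connection, or None on failure."""
    try:
        return sqlite3.connect(db_file)
    except sqlite3.Error:
        return None


# ":memory:" opens a throwaway in-memory database.
conn = open_sqlite_sketch(":memory:")
if conn is not None:
    conn.execute("CREATE TABLE demo (k TEXT, v TEXT)")
    conn.close()  # the counterpart of a close_connection-style helper
```

Because the helper can return None, callers should check the result before using it.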

kgx.utils.kgx_utils.current_time_in_millis()

Get current time in milliseconds.

Returns:

Time in milliseconds

Return type:

int
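A millisecond timestamp of this kind is usually derived from the epoch time. The sketch below assumes that approach; millis_sketch is a hypothetical name, shown here timing a small piece of work.

```python
import time


def millis_sketch() -> int:
    """Hypothetical illustration: epoch time as integer milliseconds."""
    return int(round(time.time() * 1000))


start = millis_sketch()
total = sum(range(100_000))  # some work to time
elapsed_ms = millis_sketch() - start
print(f"summed to {total} in about {elapsed_ms} ms")
```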

kgx.utils.kgx_utils.expand(curie: str, prefix_maps: Optional[List[dict]] = None, fallback: bool = True) -> str

Expand a given CURIE to a URI, based on mappings from prefix_maps.

This method will return the CURIE as the URI if there is no mapping found.

Parameters:
  • curie (str) – A CURIE

  • prefix_maps (Optional[List[dict]]) – A list of prefix maps to use for mapping

  • fallback (bool) – Determines whether to fall back to the default prefix mappings, as determined by prefixcommons.curie_util, when the CURIE prefix is not found in prefix_maps.

Returns:

A URI corresponding to the CURIE

Return type:

str
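Expansion is the inverse lookup: split the CURIE at the first colon and look the prefix up in each map. The sketch below uses a hypothetical expand_sketch function and an illustrative prefix map, and does not model the fallback to the prefixcommons defaults.

```python
from typing import Dict, List, Optional


def expand_sketch(curie: str, prefix_maps: Optional[List[Dict[str, str]]] = None) -> str:
    """Hypothetical illustration of CURIE -> URI expansion."""
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        # Not a CURIE at all; return the input unchanged.
        return curie
    for prefix_map in prefix_maps or []:
        if prefix in prefix_map:
            return prefix_map[prefix] + local_id
    # No mapping found: return the CURIE unchanged, as described above.
    return curie


# Illustrative prefix map, assumed for this example only.
example_prefixes = [{"HGNC": "https://identifiers.org/hgnc/"}]
print(expand_sketch("HGNC:11603", example_prefixes))
print(expand_sketch("UNKNOWN:1", example_prefixes))
```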

Convert a sentence case Biolink category name to a proper Biolink CURIE with the category itself in CamelCase form.

Parameters:

s (str) – Input string in sentence case

Returns:

a proper Biolink CURIE

Return type:

str

kgx.utils.kgx_utils.generate_edge_identifiers(graph: BaseGraph)

Generate unique identifiers for edges in a graph that do not have an id field.

Parameters:

graph (kgx.graph.base_graph.BaseGraph) –

kgx.utils.kgx_utils.generate_edge_key(s: str, edge_predicate: str, o: str) -> str

Generates an edge key based on a given subject, predicate, and object.

Parameters:
  • s (str) – Subject

  • edge_predicate (str) – Edge label

  • o (str) – Object

  • id (str) – Optional identifier that is used as the key if provided

Returns:

Edge key as a string

Return type:

str
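A key built from the subject, predicate, and object is deterministic, which makes it usable for detecting duplicate edges. The sketch below assumes a hyphen-joined format; both that format and the name edge_key_sketch are assumptions, not the documented output of the library.

```python
def edge_key_sketch(s: str, edge_predicate: str, o: str) -> str:
    """Hypothetical illustration: join the triple into a single key."""
    # Assumed format: subject-predicate-object.
    return f"{s}-{edge_predicate}-{o}"


edges = [
    ("HGNC:11603", "biolink:related_to", "MONDO:0005002"),
    ("HGNC:11603", "biolink:related_to", "MONDO:0005002"),  # duplicate triple
    ("HGNC:11603", "biolink:interacts_with", "HGNC:11604"),
]
# Identical triples map to the same key, so a dict keeps one entry per edge.
unique_edges = {edge_key_sketch(*edge): edge for edge in edges}
print(len(unique_edges))
```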

kgx.utils.kgx_utils.generate_uuid()

Generates a UUID.

Returns:

A UUID

Return type:

str

Get ancestors for a given Biolink class.

Parameters:

name (str) –

Returns:

A list of ancestors

Return type:

List

Get Biolink element for a given name, where name can be a class, slot, or relation.

Parameters:

name (str) – The name

Returns:

An instance of linkml_model.meta.Element

Return type:

Optional[linkml_model.meta.Element]

Get all Biolink property types. This includes both node and edge properties.

Returns:

A dict containing all Biolink properties and their types

Return type:

Dict

kgx.utils.kgx_utils.get_cache(maxsize=10000)

Get an instance of cachetools.cache

Parameters:

maxsize (int) – The max size for the cache (10000, by default)

Returns:

An instance of cachetools.cache

Return type:

cachetools.cache

kgx.utils.kgx_utils.get_curie_lookup_service()

Get an instance of kgx.curie_lookup_service.CurieLookupService

Returns:

An instance of CurieLookupService

Return type:

kgx.curie_lookup_service.CurieLookupService

kgx.utils.kgx_utils.get_prefix_prioritization_map() -> Dict[str, List]

Get prefix prioritization map as defined in Biolink Model.

Return type:

Dict[str, List]

kgx.utils.kgx_utils.get_toolkit(biolink_release: Optional[str] = None) -> Toolkit

Get an instance of bmt.Toolkit. If no instance is defined, one is instantiated and returned.

Parameters:

biolink_release (Optional[str]) – URL to the (Biolink) Model Schema to be used for validation (default: None, use the default Biolink Model Toolkit schema)

kgx.utils.kgx_utils.get_type_for_property(p: str) -> str

Get type for a property.

Parameters:

p (str) –

Returns:

The type for a given property

Return type:

str

kgx.utils.kgx_utils.is_null(item: Any) -> bool

Checks if a given item is null or corresponds to null.

This method checks for: None, numpy.nan, pandas.NA, pandas.NaT, and ` `

Parameters:

item (Any) – The item to check

Returns:

Whether the given item is null or not

Return type:

bool
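A dependency-free sketch of such a null check is shown below. It covers None, float NaN, and empty or whitespace-only strings; it does not handle pandas.NA or pandas.NaT, and treating all whitespace-only strings as null is an assumption of this example. The name is_null_sketch is hypothetical.

```python
import math
from typing import Any


def is_null_sketch(item: Any) -> bool:
    """Hypothetical illustration of a null check without numpy or pandas."""
    if item is None:
        return True
    if isinstance(item, float) and math.isnan(item):
        return True
    if isinstance(item, str) and item.strip() == "":
        # Assumption: empty and whitespace-only strings count as null.
        return True
    return False


for candidate in (None, float("nan"), " ", 0, "gene"):
    print(repr(candidate), is_null_sketch(candidate))
```

Note that 0 and other falsy values are not null here; only the explicit null markers are.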

kgx.utils.kgx_utils.prepare_data_dict(d1: Dict, d2: Dict, preserve: bool = True) -> Dict

Given two dict objects, make a new dict object that is the intersection of the two.

If a key is known to be multivalued then its value is converted to a list. If a key is already multivalued then it is updated with new values. If a key is single valued, and a new unique value is found, then the existing value is converted to a list and the new value is appended to this list.

Parameters:
  • d1 (Dict) – Dict object

  • d2 (Dict) – Dict object

  • preserve (bool) – Whether or not to preserve values for conflicting keys

Returns:

The intersection of d1 and d2

Return type:

Dict
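The three multivalue rules above can be sketched as follows. This is a simplified illustration under stated assumptions: the name merge_dicts_sketch is hypothetical, the set of multivalued keys is passed in explicitly, keys from both inputs are kept, and the preserve flag is not modeled.

```python
from typing import Any, Dict, FrozenSet, List


def _as_list(value: Any) -> List[Any]:
    # Wrap single values so both cases can be handled uniformly.
    return value if isinstance(value, list) else [value]


def merge_dicts_sketch(
    d1: Dict[str, Any], d2: Dict[str, Any], multivalued: FrozenSet[str] = frozenset()
) -> Dict[str, Any]:
    """Hypothetical illustration of merging two property dicts."""
    merged: Dict[str, Any] = {}
    for key in list(d1) + [k for k in d2 if k not in d1]:
        values: List[Any] = []
        for source in (d1, d2):
            if key in source:
                for v in _as_list(source[key]):
                    if v not in values:  # keep only unique values, in order
                        values.append(v)
        was_list = any(isinstance(src.get(key), list) for src in (d1, d2))
        if key in multivalued or was_list or len(values) > 1:
            merged[key] = values
        else:
            merged[key] = values[0]
    return merged


a = {"id": "A", "name": "x", "category": "biolink:Gene", "xref": ["X:1"]}
b = {"name": "y", "category": "biolink:Gene", "xref": ["X:1", "X:2"]}
merged_example = merge_dicts_sketch(a, b, multivalued=frozenset({"category"}))
print(merged_example)
```

Here name becomes a list because a second unique value appears, category becomes a list because it is declared multivalued, and xref is extended with the new value only.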

kgx.utils.kgx_utils.remove_null(input: Any) -> Any

Remove any null values from input.

Parameters:

input (Any) – Can be a str, list or dict

Returns:

The input without any null values

Return type:

Any
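A recursive sketch of null removal for str, list, and dict inputs is shown below. The name remove_null_sketch is hypothetical, and two behaviors are assumptions of this example: only None and blank strings are treated as null, and a null scalar is returned as None.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    # Assumption: None and blank strings are the only null markers here.
    return value is None or (isinstance(value, str) and value.strip() == "")


def remove_null_sketch(value: Any) -> Any:
    """Hypothetical illustration of stripping null values recursively."""
    if isinstance(value, dict):
        cleaned = {k: remove_null_sketch(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if not _is_empty(v)}
    if isinstance(value, list):
        cleaned_items = [remove_null_sketch(v) for v in value]
        return [v for v in cleaned_items if not _is_empty(v)]
    return None if _is_empty(value) else value


record = {"a": None, "b": ["x", "", None], "c": "y"}
print(remove_null_sketch(record))
```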

kgx.utils.kgx_utils.sanitize_import(data: Dict, list_delimiter: Optional[str] = None) -> Dict

Sanitize key-value pairs in dictionary. This should be used to ensure proper syntax and types for node and edge data as it is imported.

Parameters:
  • data (Dict) – A dictionary containing key-value pairs

  • list_delimiter (str) – Optionally provide a delimiter character or string to be used to split strings into lists.

Returns:

A dictionary containing processed key-value pairs

Return type:

Dict
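On import the conversion runs the other way: delimited strings are split back into lists. The sketch below is an illustration only; the name import_row_sketch, the explicit multivalued argument, and the choice to skip empty values are assumptions.

```python
from typing import Any, Dict, FrozenSet, Optional


def import_row_sketch(
    data: Dict[str, Any],
    list_delimiter: Optional[str] = None,
    multivalued: FrozenSet[str] = frozenset({"category"}),
) -> Dict[str, Any]:
    """Hypothetical stand-in for an import sanitizer; not the KGX code."""
    row: Dict[str, Any] = {}
    for key, value in data.items():
        if value is None or value == "":
            # Assumption: empty fields are dropped on import.
            continue
        if key in multivalued and isinstance(value, str):
            # Split delimited strings; wrap single values in a list.
            row[key] = value.split(list_delimiter) if list_delimiter else [value]
        else:
            row[key] = value
    return row


raw = {"id": "HGNC:11603", "category": "biolink:Gene|biolink:NamedThing", "name": ""}
imported = import_row_sketch(raw, list_delimiter="|")
print(imported)
```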

kgx.utils.kgx_utils.sentencecase_to_camelcase(s: str) -> str

Convert sentence case to CamelCase.

Parameters:

s (str) – Input string in sentence case

Returns:

string in CamelCase form

Return type:

str
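As an illustration, sentence case can be turned into CamelCase by capitalizing the first letter of each word and joining the words. The function sentence_to_camel_sketch is a hypothetical name, not the library function.

```python
def sentence_to_camel_sketch(s: str) -> str:
    """Hypothetical illustration of sentence case -> CamelCase."""
    # Uppercase only the first character of each word, leaving the rest as is.
    return "".join(word[:1].upper() + word[1:] for word in s.split())


print(sentence_to_camel_sketch("named thing"))
print(sentence_to_camel_sketch("gene to disease association"))
```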

kgx.utils.kgx_utils.sentencecase_to_snakecase(s: str) -> str

Convert sentence case to snake_case.

Parameters:

s (str) – Input string in sentence case

Returns:

string in snake_case form

Return type:

str

kgx.utils.kgx_utils.snakecase_to_sentencecase(s: str) -> str

Convert snake_case to sentence case.

Parameters:

s (str) – Input string in snake_case

Returns:

string in sentence case form

Return type:

str
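The sentence case and snake_case conversions are inverses of each other for simple inputs. The sketch below uses hypothetical helper names and plain string operations to show the round trip; the library may handle edge cases differently.

```python
def sentence_to_snake_sketch(s: str) -> str:
    """Hypothetical illustration of sentence case -> snake_case."""
    return "_".join(s.lower().split())


def snake_to_sentence_sketch(s: str) -> str:
    """Hypothetical illustration of snake_case -> sentence case."""
    # Empty parts (from repeated underscores) are skipped.
    return " ".join(part for part in s.split("_") if part).lower()


slot = "related to"
snake = sentence_to_snake_sketch(slot)
print(snake)
print(snake_to_sentence_sketch(snake) == slot)
```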