KGX Utils
Utility methods that are reused across the codebase.
kgx.utils.kgx_utils
- kgx.utils.kgx_utils.apply_edge_filters(graph: BaseGraph, edge_filters: Dict[str, Union[str, Set]]) -> None
Apply filters to graph and remove edges that do not pass given filters.
- Parameters:
graph (kgx.graph.base_graph.BaseGraph) – The graph
- kgx.utils.kgx_utils.apply_filters(graph: BaseGraph, node_filters: Dict[str, Union[str, Set]], edge_filters: Dict[str, Union[str, Set]]) -> None
Apply filters to graph and remove nodes and edges that do not pass given filters.
- Parameters:
graph (kgx.graph.base_graph.BaseGraph) – The graph
- kgx.utils.kgx_utils.apply_graph_operations(graph: BaseGraph, operations: List) -> None
Apply graph operations to a given graph.
- Parameters:
graph (kgx.graph.base_graph.BaseGraph) – An instance of BaseGraph
operations (List) – A list of graph operations with configuration
- kgx.utils.kgx_utils.apply_node_filters(graph: BaseGraph, node_filters: Dict[str, Union[str, Set]]) -> None
Apply filters to graph and remove nodes that do not pass given filters.
- Parameters:
graph (kgx.graph.base_graph.BaseGraph) – The graph
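As an illustration of the in-place filtering described above, here is a minimal sketch that works on a plain dictionary of nodes rather than a BaseGraph. The names `node_passes` and `filter_nodes` are hypothetical, and the matching rule (a node passes when every filter key has at least one allowed value) is an assumption, not confirmed KGX behavior.

```python
from typing import Any, Dict, Set, Union

Filters = Dict[str, Union[str, Set[str]]]


def node_passes(node: Dict[str, Any], filters: Filters) -> bool:
    """Assumed rule: every filter key must match at least one allowed value."""
    for key, allowed in filters.items():
        allowed_set = {allowed} if isinstance(allowed, str) else set(allowed)
        value = node.get(key)
        values = set(value) if isinstance(value, (list, set, tuple)) else {value}
        if not values & allowed_set:
            return False
    return True


def filter_nodes(nodes: Dict[str, Dict[str, Any]], filters: Filters) -> None:
    """Remove failing nodes in place, mirroring the None return type above."""
    failing = [n for n, data in nodes.items() if not node_passes(data, filters)]
    for node_id in failing:
        del nodes[node_id]


# Illustrative data only; the identifiers are made up.
nodes = {
    "EX:1": {"category": ["biolink:Gene"]},
    "EX:2": {"category": ["biolink:Disease"]},
}
filter_nodes(nodes, {"category": {"biolink:Gene"}})
print(sorted(nodes))
```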
- kgx.utils.kgx_utils.build_export_row(data: Dict, list_delimiter: Optional[str] = None) -> Dict
Sanitize key-value pairs in dictionary. This should be used to ensure proper syntax and types for node and edge data as it is exported.
- Parameters:
data (Dict) – A dictionary containing key-value pairs
list_delimiter (str) – Optionally provide a delimiter character or string to be used to convert lists into strings.
- Returns:
A dictionary containing processed key-value pairs
- Return type:
Dict
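The role of list_delimiter on export can be sketched as follows. This is a simplified stand-in, not the KGX implementation: `export_row` is a hypothetical name, and it only covers the list-to-string conversion mentioned above, not any other type handling.

```python
from typing import Any, Dict, Optional


def export_row(data: Dict[str, Any], list_delimiter: Optional[str] = None) -> Dict[str, Any]:
    """Join list values into one string when a delimiter is given (assumed behavior)."""
    row: Dict[str, Any] = {}
    for key, value in data.items():
        if isinstance(value, list) and list_delimiter is not None:
            row[key] = list_delimiter.join(str(v) for v in value)
        else:
            row[key] = value
    return row


# Illustrative record; the identifier is made up.
record = {"id": "EX:1", "category": ["biolink:Gene", "biolink:NamedThing"]}
print(export_row(record, list_delimiter="|"))
```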
- kgx.utils.kgx_utils.camelcase_to_sentencecase(s: str) -> str
Convert CamelCase to sentence case.
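To make the conversion concrete, here is a rough regex-based approximation. The helper name `camel_to_sentence` is hypothetical, and KGX may handle edge cases such as acronyms differently.

```python
import re


def camel_to_sentence(s: str) -> str:
    """Insert a space before each interior capital letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", " ", s).lower()


print(camel_to_sentence("NamedThing"))
print(camel_to_sentence("GeneToDiseaseAssociation"))
```

Note that this simple rule splits every capital letter, so runs of capitals are broken into single letters.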
- kgx.utils.kgx_utils.close_connection(conn)
Close a database connection to the SQLite database.
- Returns:
None
- kgx.utils.kgx_utils.contract(uri: str, prefix_maps: Optional[List[Dict]] = None, fallback: bool = True) -> str
Contract a given URI to a CURIE, based on mappings from prefix_maps. If no prefix map is provided, the defaults from prefixcommons-py are used.
This method will return the URI as the CURIE if there is no mapping found.
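- Parameters:
uri (str) – A URI
prefix_maps (Optional[List[Dict]]) – A list of prefix maps to use for mapping
fallback (bool) – Whether to fall back to the default prefix mappings when no mapping is found in prefix_maps
- Returns:
A CURIE corresponding to the URI
- Return type:
str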
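The contraction logic can be sketched with plain dictionaries. This is a minimal stand-in under stated assumptions: `contract_uri` is a hypothetical name, the namespace in `EXAMPLE_MAPS` is illustrative, and the sketch omits the fallback to prefixcommons-py defaults.

```python
from typing import Dict, List, Optional


def contract_uri(uri: str, prefix_maps: Optional[List[Dict[str, str]]] = None) -> str:
    """Return a CURIE using the first matching namespace, else the URI unchanged."""
    for prefix_map in prefix_maps or []:
        for prefix, namespace in prefix_map.items():
            if uri.startswith(namespace):
                return f"{prefix}:{uri[len(namespace):]}"
    return uri


# Illustrative prefix map; the namespace is made up for this example.
EXAMPLE_MAPS = [{"EX": "https://example.org/ex/"}]
print(contract_uri("https://example.org/ex/123", EXAMPLE_MAPS))
print(contract_uri("https://unknown.example/abc", EXAMPLE_MAPS))
```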
- kgx.utils.kgx_utils.create_connection(db_file)
Create a database connection to the SQLite database specified by db_file.
- Parameters:
db_file – database file
- Returns:
Connection object or None
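A connection helper with this contract (a Connection object on success, None on failure) might look like the sketch below, built on the standard library sqlite3 module. The name `open_sqlite` is hypothetical and the error handling is an assumption about how a None result could arise.

```python
import sqlite3
from typing import Optional


def open_sqlite(db_file: str) -> Optional[sqlite3.Connection]:
    """Open a SQLite database, returning None if the connection fails."""
    try:
        return sqlite3.connect(db_file)
    except sqlite3.Error as exc:
        print(f"could not open {db_file}: {exc}")
        return None


# ":memory:" creates a throwaway in-memory database, so no file is written.
conn = open_sqlite(":memory:")
if conn is not None:
    conn.execute("CREATE TABLE demo (id TEXT)")
    conn.close()
```

Callers should check for None before using the result and close the connection when done.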
- kgx.utils.kgx_utils.current_time_in_millis()
Get current time in milliseconds.
- Returns:
Time in milliseconds
- Return type:
- kgx.utils.kgx_utils.expand(curie: str, prefix_maps: Optional[List[dict]] = None, fallback: bool = True) -> str
Expand a given CURIE to a URI, based on mappings from prefix_maps.
This method will return the CURIE as the URI if there is no mapping found.
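- Parameters:
curie (str) – A CURIE
prefix_maps (Optional[List[dict]]) – A list of prefix maps to use for mapping
fallback (bool) – Whether to fall back to the default prefix mappings when no mapping is found in prefix_maps
- Returns:
A URI corresponding to the CURIE
- Return type:
str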
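Expansion is the reverse lookup: split the CURIE at the first colon and look the prefix up in each map. The sketch below assumes plain dictionaries and leaves out the fallback to default mappings. `expand_curie` and the namespace in `EXAMPLE_MAPS` are hypothetical.

```python
from typing import Dict, List, Optional


def expand_curie(curie: str, prefix_maps: Optional[List[Dict[str, str]]] = None) -> str:
    """Return the URI for a CURIE, or the CURIE unchanged if no prefix matches."""
    prefix, separator, local_id = curie.partition(":")
    if separator:
        for prefix_map in prefix_maps or []:
            if prefix in prefix_map:
                return prefix_map[prefix] + local_id
    return curie


# Illustrative prefix map; the namespace is made up for this example.
EXAMPLE_MAPS = [{"EX": "https://example.org/ex/"}]
print(expand_curie("EX:123", EXAMPLE_MAPS))
print(expand_curie("UNKNOWN:123", EXAMPLE_MAPS))
```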
- kgx.utils.kgx_utils.format_biolink_category(s: str) -> str
Convert a sentence case Biolink category name to a proper Biolink CURIE with the category itself in CamelCase form.
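For example, the sentence case name "named thing" corresponds to the CURIE biolink:NamedThing. A rough approximation of that conversion follows. `to_biolink_category` is a hypothetical name, and the word-capitalization rule is an assumption that will not preserve acronyms.

```python
def to_biolink_category(s: str) -> str:
    """Capitalize each word, join without spaces, and add the biolink: prefix."""
    camel = "".join(word.capitalize() for word in s.split())
    return f"biolink:{camel}"


print(to_biolink_category("named thing"))
print(to_biolink_category("gene to disease association"))
```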
- kgx.utils.kgx_utils.generate_edge_identifiers(graph: BaseGraph)
Generate unique identifiers for edges in a graph that do not have an id field.
- Parameters:
graph (kgx.graph.base_graph.BaseGraph) –
- kgx.utils.kgx_utils.generate_edge_key(s: str, edge_predicate: str, o: str) -> str
Generates an edge key based on a given subject, predicate, and object.
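The point of such a key is that the same subject, predicate, and object always yield the same string. One way to get that property is shown below. `make_edge_key` is a hypothetical name and the hyphen-joined format is an assumption, since the exact key format is not stated here.

```python
def make_edge_key(s: str, edge_predicate: str, o: str) -> str:
    """Build a deterministic key from the triple (assumed hyphen-joined format)."""
    return f"{s}-{edge_predicate}-{o}"


# Illustrative identifiers; they are made up for this example.
print(make_edge_key("EX:1", "biolink:related_to", "EX:2"))
```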
- kgx.utils.kgx_utils.get_biolink_ancestors(name: str)
Get ancestors for a given Biolink class.
- Parameters:
name (str) –
- Returns:
A list of ancestors
- Return type:
List
- kgx.utils.kgx_utils.get_biolink_element(name) -> Optional[Element]
Get Biolink element for a given name, where name can be a class, slot, or relation.
- Parameters:
name (str) – The name
- Returns:
An instance of linkml_model.meta.Element
- Return type:
Optional[linkml_model.meta.Element]
- kgx.utils.kgx_utils.get_biolink_property_types() -> Dict
Get all Biolink property types. This includes both node and edge properties.
- Returns:
A dict containing all Biolink properties and their types
- Return type:
Dict
- kgx.utils.kgx_utils.get_cache(maxsize=10000)
Get an instance of cachetools.cache
- Parameters:
maxsize (int) – The max size for the cache (10000, by default)
- Returns:
An instance of cachetools.cache
- Return type:
cachetools.cache
- kgx.utils.kgx_utils.get_curie_lookup_service()
Get an instance of kgx.curie_lookup_service.CurieLookupService
- Returns:
An instance of CurieLookupService
- Return type:
kgx.curie_lookup_service.CurieLookupService
- kgx.utils.kgx_utils.get_prefix_prioritization_map() -> Dict[str, List]
Get prefix prioritization map as defined in Biolink Model.
- Return type:
Dict[str, List]
- kgx.utils.kgx_utils.get_toolkit(biolink_release: Optional[str] = None) -> Toolkit
Get an instance of bmt.Toolkit. If no instance is defined, one is instantiated and returned.
- Parameters:
biolink_release (Optional[str]) – URL to the (Biolink) Model Schema to be used for validation (default: None, use the default Biolink Model Toolkit schema)
- kgx.utils.kgx_utils.is_null(item: Any) -> bool
Checks if a given item is null or corresponds to null.
This method checks for: None, numpy.nan, pandas.NA, pandas.NaT, and ` `
- Parameters:
item (Any) – The item to check
- Returns:
Whether the given item is null or not
- Return type:
bool
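A standard-library-only approximation of this kind of check is sketched below. It is a simplified stand-in: `looks_null` is a hypothetical name, it treats any float NaN and any blank string as null, and it does not cover pandas.NA or pandas.NaT because it avoids third-party imports.

```python
import math
from typing import Any


def looks_null(item: Any) -> bool:
    """Return True for None, float NaN, and empty or whitespace-only strings."""
    if item is None:
        return True
    if isinstance(item, float) and math.isnan(item):
        return True
    if isinstance(item, str) and item.strip() == "":
        return True
    return False


for candidate in (None, float("nan"), " ", 0, "gene"):
    print(repr(candidate), looks_null(candidate))
```

Note that falsy values such as 0 are not treated as null in this sketch.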
- kgx.utils.kgx_utils.prepare_data_dict(d1: Dict, d2: Dict, preserve: bool = True) -> Dict
Given two dict objects, make a new dict object that is the intersection of the two.
If a key is known to be multivalued, its value is converted to a list. If a key is already multivalued, it is updated with the new values. If a key is single-valued and a new unique value is found, the existing value is converted to a list and the new value is appended to it.
- Parameters:
d1 (Dict) – Dict object
d2 (Dict) – Dict object
preserve (bool) – Whether or not to preserve values for conflicting keys
- Returns:
The intersection of d1 and d2
- Return type:
Dict
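The value-handling rules above can be sketched on plain dictionaries. This is an illustration under stated assumptions, not the KGX implementation: `merge_records` is a hypothetical name, the set of multivalued keys is passed in explicitly, and the preserve flag is not modeled.

```python
from typing import Any, Dict, FrozenSet


def merge_records(
    d1: Dict[str, Any],
    d2: Dict[str, Any],
    multivalued: FrozenSet[str] = frozenset(),
) -> Dict[str, Any]:
    """Combine two records, promoting values to lists as described above."""
    merged: Dict[str, Any] = {}
    for source in (d1, d2):
        for key, value in source.items():
            incoming = list(value) if isinstance(value, list) else [value]
            if key not in merged:
                # First sighting: keep lists for multivalued keys, scalars otherwise.
                is_list = key in multivalued or isinstance(value, list)
                merged[key] = incoming if is_list else value
                continue
            current = merged[key]
            existing = current if isinstance(current, list) else [current]
            new_items = [v for v in incoming if v not in existing]
            if new_items or isinstance(current, list):
                merged[key] = existing + new_items
    return merged


# Illustrative records; the values are made up.
a = {"name": "A", "category": ["biolink:Gene"]}
b = {"name": "B", "category": ["biolink:Gene", "biolink:NamedThing"]}
print(merge_records(a, b, multivalued=frozenset({"category"})))
```

In this sketch a single-valued key stays a scalar when both records agree, and becomes a list only when a new unique value appears.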
- kgx.utils.kgx_utils.remove_null(input: Any) -> Any
Remove any null values from input.
- Parameters:
input (Any) – Can be a str, list or dict
- Returns:
The input without any null values
- Return type:
Any
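A recursive version of this idea for dicts and lists might look like the following. It is a sketch only: `drop_nulls` and `_is_empty` are hypothetical names, and "null" is narrowed here to None and blank strings.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    """Assumed null test for this sketch: None or a blank string."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def drop_nulls(value: Any) -> Any:
    """Return a copy of value with null entries removed from dicts and lists."""
    if isinstance(value, dict):
        cleaned = {k: drop_nulls(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if not _is_empty(v)}
    if isinstance(value, list):
        cleaned_items = [drop_nulls(v) for v in value]
        return [v for v in cleaned_items if not _is_empty(v)]
    return value


# Illustrative record; the values are made up.
print(drop_nulls({"id": "EX:1", "name": None, "synonym": ["a", "", None]}))
```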
- kgx.utils.kgx_utils.sanitize_import(data: Dict, list_delimiter: Optional[str] = None) -> Dict
Sanitize key-value pairs in dictionary. This should be used to ensure proper syntax and types for node and edge data as it is imported.
- Parameters:
data (Dict) – A dictionary containing key-value pairs
list_delimiter (str) – Optionally provide a delimiter character or string to be used to split strings into lists.
- Returns:
A dictionary containing processed key-value pairs
- Return type:
Dict
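On import, list_delimiter works in the opposite direction from export: delimited strings are split back into lists. The sketch below assumes the caller names which keys are multivalued. `import_row` and its `multivalued` parameter are hypothetical, and other type coercion is left out.

```python
from typing import Any, Dict, FrozenSet, Optional


def import_row(
    data: Dict[str, Any],
    list_delimiter: Optional[str] = None,
    multivalued: FrozenSet[str] = frozenset({"category"}),
) -> Dict[str, Any]:
    """Split delimited strings into lists for keys assumed to be multivalued."""
    row: Dict[str, Any] = {}
    for key, value in data.items():
        if list_delimiter and key in multivalued and isinstance(value, str):
            row[key] = [part for part in value.split(list_delimiter) if part]
        else:
            row[key] = value
    return row


# Illustrative record; the identifier is made up.
raw = {"id": "EX:1", "category": "biolink:Gene|biolink:NamedThing"}
print(import_row(raw, list_delimiter="|"))
```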
- kgx.utils.kgx_utils.sentencecase_to_camelcase(s: str) -> str
Convert sentence case to CamelCase.