KGX CLI

kgx

Knowledge Graph Exchange CLI entrypoint.

kgx [OPTIONS] COMMAND [ARGS]...

Options

--version: Show the version and exit.

graph-summary

Loads and summarizes a knowledge graph from a set of input files.

Parameters

inputs: List[str]: Input files
input_format: str: Input file format
input_compression: Optional[str]: The input compression type
output: Optional[str]: Where to write the output (stdout, by default)
report_type: str: The summary report type: “kgx-map” or “meta-knowledge-graph”
report_format: Optional[str]: The summary report format: ‘yaml’ or ‘json’ (default is report_type specific)
graph_name: str: User-specified name of the graph being summarized
node_facet_properties: Optional[List]: A list of node properties from which to generate counts per value for those properties. For example, ['provided_by']
edge_facet_properties: Optional[List]: A list of edge properties from which to generate counts per value for those properties. For example, ['original_knowledge_source', 'aggregator_knowledge_source']
error_log: str: Where to write any graph processing error messages (stderr, by default, for empty argument)

kgx graph-summary [OPTIONS] INPUTS...

Options

-i, --input-format <input_format>: Required. The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>: The input compression type

-o, --output <output>: Required.

-r, --report-type <report_type>: The summary report type. Must be one of (‘kgx-map’, ‘meta-knowledge-graph’)

-f, --report-format <report_format>: The summary report format. Can be one of (‘yaml’, ‘json’)

-n, --graph-name <graph_name>: User-specified name of the graph being summarized (default: ‘Graph’)

--node-facet-properties <node_facet_properties>: A list of node properties from which to generate counts per value for those properties

--edge-facet-properties <edge_facet_properties>: A list of edge properties from which to generate counts per value for those properties

-l, --error-log <error_log>: File in which to report graph data parsing errors (default: “stderr”)

Arguments

INPUTS: Required argument(s)
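A minimal Python sketch of how a `kgx graph-summary` invocation could be assembled from the options listed above. The helper name `graph_summary_argv` and the file names are hypothetical and used only for illustration; the flags are the ones documented in this section.

```python
import shlex
from typing import List, Optional, Sequence


def graph_summary_argv(
    inputs: Sequence[str],
    input_format: str,
    output: str,
    report_type: Optional[str] = None,
    report_format: Optional[str] = None,
    graph_name: Optional[str] = None,
) -> List[str]:
    """Build the argument vector for `kgx graph-summary` (illustrative helper)."""
    argv = ["kgx", "graph-summary", "-i", input_format, "-o", output]
    if report_type is not None:
        argv += ["-r", report_type]
    if report_format is not None:
        argv += ["-f", report_format]
    if graph_name is not None:
        argv += ["-n", graph_name]
    # INPUTS are positional, so they are appended after the options.
    return argv + list(inputs)


# Hypothetical file names, for illustration only.
summary_cmd = graph_summary_argv(
    ["nodes.tsv", "edges.tsv"],
    input_format="tsv",
    output="summary.yaml",
    report_type="kgx-map",
    report_format="yaml",
)
# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(summary_cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when file names contain spaces.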

merge

Load nodes and edges from files and KGs, as defined in a config YAML, and merge them into a single graph. The merged graph can then be written to a local/remote Neo4j instance OR be serialized into a file.

Note

Everything here is driven by the merge-config YAML.

Parameters

merge_config: str: Merge config YAML
source: List: A list of sources to load from the YAML
destination: List: A list of destinations to write to, as defined in the YAML
processes: int: Number of processes to use

kgx merge [OPTIONS]

Options

--merge-config <merge_config>: Required.

--source <source>: Source(s) from the YAML to process

--destination <destination>: Destination(s) from the YAML to process

-p, --processes <processes>: Number of processes to use
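The following Python sketch shows one way to assemble a `kgx merge` command line. It assumes that `--source` and `--destination` are repeated once per value, which is how list-valued options are commonly passed but is not stated in this reference. The helper name, the YAML file name, and the source and destination keys are all hypothetical.

```python
import shlex
from typing import List, Optional, Sequence


def merge_argv(
    merge_config: str,
    sources: Sequence[str] = (),
    destinations: Sequence[str] = (),
    processes: Optional[int] = None,
) -> List[str]:
    """Build the argument vector for `kgx merge` (illustrative helper)."""
    argv = ["kgx", "merge", "--merge-config", merge_config]
    # Assumption: each source/destination key gets its own flag occurrence.
    for name in sources:
        argv += ["--source", name]
    for name in destinations:
        argv += ["--destination", name]
    if processes is not None:
        argv += ["-p", str(processes)]
    return argv


# Hypothetical config file and keys; real keys come from your merge YAML.
merge_cmd = merge_argv(
    "merge.yaml",
    sources=["source-a", "source-b"],
    destinations=["merged-file"],
    processes=2,
)
print(shlex.join(merge_cmd))
```

Omitting `sources` and `destinations` leaves the selection entirely to the merge config.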

neo4j-download

Download nodes and edges from a Neo4j database.

Parameters

uri: str: Neo4j URI. For example, https://localhost:7474
username: str: Username for authentication
password: str: Password for authentication
output: str: Where to write the output (stdout, by default)
output_format: str: The output type (tsv, by default)
output_compression: str: The output compression type
stream: bool: Whether to parse input as a stream
node_filters: Tuple[str, str]: Node filters
edge_filters: Tuple[str, str]: Edge filters

kgx neo4j-download [OPTIONS]

Options

-l, --uri <uri>: Required. Neo4j URI to download from. For example, https://localhost:7474

-u, --username <username>: Required. Neo4j username

-p, --password <password>: Required. Neo4j password

-o, --output <output>: Required. Output

-f, --output-format <output_format>: Required. The output format. Can be one of (‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘neo4j’, ‘nt’, ‘null’, ‘sql’, ‘tsv’, ‘parquet’)

-d, --output-compression <output_compression>: The output compression type

-s, --stream: Parse input as a stream

-n, --node-filters <node_filters>: Filters for filtering nodes from the input graph

-e, --edge-filters <edge_filters>: Filters for filtering edges from the input graph
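A hedged Python sketch of a `kgx neo4j-download` invocation. It assumes that each filter is passed as two values after the flag, following the `Tuple[str, str]` type shown in the parameters. The environment variable names `NEO4J_USERNAME` and `NEO4J_PASSWORD`, the helper name, the output name, and the filter values are hypothetical.

```python
import os
from typing import List, Sequence, Tuple

Filter = Tuple[str, str]


def neo4j_download_argv(
    uri: str,
    username: str,
    password: str,
    output: str,
    output_format: str,
    node_filters: Sequence[Filter] = (),
    edge_filters: Sequence[Filter] = (),
    stream: bool = False,
) -> List[str]:
    """Build the argument vector for `kgx neo4j-download` (illustrative helper)."""
    argv = [
        "kgx", "neo4j-download",
        "-l", uri,
        "-u", username,
        "-p", password,
        "-o", output,
        "-f", output_format,
    ]
    if stream:
        argv.append("-s")
    # Assumption: each filter is a (name, value) pair following its flag.
    for name, value in node_filters:
        argv += ["-n", name, value]
    for name, value in edge_filters:
        argv += ["-e", name, value]
    return argv


# Read credentials from the environment instead of hard-coding them.
download_cmd = neo4j_download_argv(
    uri="https://localhost:7474",
    username=os.environ.get("NEO4J_USERNAME", "neo4j"),
    password=os.environ.get("NEO4J_PASSWORD", ""),
    output="neo4j-dump",
    output_format="tsv",
    node_filters=[("category", "biolink:Gene")],  # illustrative filter
)
```

Keeping the password out of source code is a design choice of this sketch; the CLI itself only requires that `-p` be supplied.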

neo4j-upload

Upload a set of nodes/edges to a Neo4j database.

Parameters

inputs: List[str]: A list of files that contain nodes/edges
input_format: str: The input format
input_compression: str: The input compression type
uri: str: The full HTTP address for Neo4j database
username: str: Username for authentication
password: str: Password for authentication
stream: bool: Whether to parse input as a stream
node_filters: Tuple[str, str]: Node filters
edge_filters: Tuple[str, str]: Edge filters

kgx neo4j-upload [OPTIONS] INPUTS...

Options

-i, --input-format <input_format>: Required. The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>: The input compression type

-l, --uri <uri>: Required. Neo4j URI to upload to. For example, https://localhost:7474

-u, --username <username>: Required. Neo4j username

-p, --password <password>: Required. Neo4j password

-s, --stream: Parse input as a stream

-n, --node-filters <node_filters>: Filters for filtering nodes from the input graph

-e, --edge-filters <edge_filters>: Filters for filtering edges from the input graph

Arguments

INPUTS: Required argument(s)
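Because `--input-format` accepts only a fixed set of values, a wrapper script can reject a bad format before calling the CLI. The sketch below is one way to do that in Python. The helper name and the file names are hypothetical, and the format tuple is copied from the option description above.

```python
from typing import List, Sequence

# Input formats as listed for --input-format in this reference.
INPUT_FORMATS = (
    "tsv", "csv", "graph", "json", "jsonl", "obojson", "obo-json",
    "trapi-json", "neo4j", "nt", "owl", "sssom", "parquet",
)


def neo4j_upload_argv(
    inputs: Sequence[str],
    input_format: str,
    uri: str,
    username: str,
    password: str,
) -> List[str]:
    """Build the argument vector for `kgx neo4j-upload` (illustrative helper)."""
    if input_format not in INPUT_FORMATS:
        raise ValueError(f"unsupported input format: {input_format!r}")
    if not inputs:
        # INPUTS is a required positional argument.
        raise ValueError("at least one input file is required")
    return [
        "kgx", "neo4j-upload",
        "-i", input_format,
        "-l", uri,
        "-u", username,
        "-p", password,
        *inputs,
    ]


# Hypothetical file names and credentials, for illustration only.
upload_cmd = neo4j_upload_argv(
    ["nodes.tsv", "edges.tsv"],
    input_format="tsv",
    uri="https://localhost:7474",
    username="neo4j",
    password="example-password",
)
```

Failing early in the wrapper gives a clearer error than a rejected command deep inside a pipeline.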

transform

Transform a Knowledge Graph from one serialization form to another.

Parameters

inputs: List[str]: A list of files that contain nodes/edges
input_format: str: The input format
input_compression: str: The input compression type
output: str: The output file
output_format: str: The output format
output_compression: str: The output compression type
stream: bool: Whether or not to stream
node_filters: Optional[List[Tuple[str, str]]]: Node input filters
edge_filters: Optional[List[Tuple[str, str]]]: Edge input filters
transform_config: str: Transform config YAML
source: List: A list of source(s) to load from the YAML
knowledge_sources: Optional[List[Tuple[str, str]]]: A list of named knowledge sources with (string, boolean or tuple rewrite) specification
infores_catalog: Optional[str]: Optional dump of a TSV file of InfoRes CURIE to Knowledge Source mappings
processes: int: Number of processes to use

kgx transform [OPTIONS] [INPUTS]...

Options

-i, --input-format <input_format>: The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>: The input compression type

-o, --output <output>: Output

-f, --output-format <output_format>: The output format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-d, --output-compression <output_compression>: The output compression type

--stream: Parse input as a stream

-n, --node-filters <node_filters>: Filters for filtering nodes from the input graph

-e, --edge-filters <edge_filters>: Filters for filtering edges from the input graph

--transform-config <transform_config>: Transform config YAML

--source <source>: Source(s) from the YAML to process

-k, --knowledge-sources <knowledge_sources>: A named knowledge source with (string, boolean or tuple rewrite) specification

--infores-catalog <infores_catalog>: Optional dump of a CSV file of InfoRes CURIE to Knowledge Source mappings

-p, --processes <processes>: Number of processes to use

Arguments

INPUTS: Optional argument(s)
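A simple format conversion with `kgx transform` could be assembled as in the Python sketch below. It covers only the direct-input mode (no `--transform-config`). The helper name and the file names are hypothetical.

```python
import shlex
from typing import List, Sequence


def transform_argv(
    inputs: Sequence[str],
    input_format: str,
    output: str,
    output_format: str,
    stream: bool = False,
) -> List[str]:
    """Build the argument vector for `kgx transform` (illustrative helper)."""
    argv = [
        "kgx", "transform",
        "-i", input_format,
        "-o", output,
        "-f", output_format,
    ]
    if stream:
        # Only the long form --stream is listed for transform.
        argv.append("--stream")
    # INPUTS are positional and follow the options.
    return argv + list(inputs)


# Hypothetical file names: convert a TSV node/edge pair to JSON Lines.
transform_cmd = transform_argv(
    ["nodes.tsv", "edges.tsv"],
    input_format="tsv",
    output="graph",
    output_format="jsonl",
    stream=True,
)
print(shlex.join(transform_cmd))
```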

validate

Run the KGX validator on one or more input files to check for Biolink Model compliance.

Parameters

inputs: List[str]: Input files
input_format: str: The input format
input_compression: str: The input compression type
output: str: Path to output file
biolink_release: Optional[str]: SemVer version of Biolink Model Release used for validation (default: latest Biolink Model Toolkit version)

kgx validate [OPTIONS] INPUTS...

Options

-i, --input-format <input_format>: Required. The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>: The input compression type

-o, --output <output>: File to write validation reports to

-b, --biolink-release <biolink_release>: Biolink Model Release (SemVer) used for validation (default: latest Biolink Model Toolkit version)

Arguments

INPUTS: Required argument(s)
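The Python sketch below shows how a script might build a `kgx validate` command and run it only when the `kgx` executable is available. The helper names `validate_argv` and `run_kgx`, the file names, and the release number `3.1.2` are hypothetical placeholders; substitute a Biolink Model release that exists for your data.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def validate_argv(
    inputs: Sequence[str],
    input_format: str,
    output: Optional[str] = None,
    biolink_release: Optional[str] = None,
) -> List[str]:
    """Build the argument vector for `kgx validate` (illustrative helper)."""
    argv = ["kgx", "validate", "-i", input_format]
    if output is not None:
        argv += ["-o", output]
    if biolink_release is not None:
        argv += ["-b", biolink_release]
    return argv + list(inputs)


def run_kgx(argv: Sequence[str]) -> Optional[int]:
    """Run the command if its executable is on PATH.

    Returns the process exit code, or None when the executable is missing.
    """
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(list(argv), check=False).returncode


# Hypothetical file names and release number, for illustration only.
validate_cmd = validate_argv(
    ["nodes.tsv", "edges.tsv"],
    input_format="tsv",
    output="validation-report.txt",
    biolink_release="3.1.2",
)
```

Calling `run_kgx(validate_cmd)` then executes the validation, and a `None` result signals that KGX is not installed in the current environment.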