KGX CLI#

kgx#

Knowledge Graph Exchange CLI entrypoint.

kgx [OPTIONS] COMMAND [ARGS]...

Options

--version#

Show the version and exit.

graph-summary#

Loads and summarizes a knowledge graph from a set of input files.

Parameters#

inputs: List[str]

Input files

input_format: str

Input file format

input_compression: Optional[str]

The input compression type

output: Optional[str]

Where to write the output (stdout, by default)

report_type: str

The summary report type: “kgx-map” or “meta-knowledge-graph”

report_format: Optional[str]

The summary report file format: ‘yaml’ or ‘json’ (default is report_type specific)

graph_name: str

User-specified name of the graph being summarized

node_facet_properties: Optional[List]

A list of node properties from which to generate counts per value for those properties. For example, ['provided_by']

edge_facet_properties: Optional[List]

A list of edge properties from which to generate counts per value for those properties. For example, ['original_knowledge_source', 'aggregator_knowledge_source']

error_log: str

Where to write any graph processing error messages (stderr by default, when the argument is empty)

kgx graph-summary [OPTIONS] INPUTS...

Options

-i, --input-format <input_format>#

Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>#

The input compression type

-o, --output <output>#

Required

-r, --report-type <report_type>#

The summary report type. Must be one of (‘kgx-map’, ‘meta-knowledge-graph’)

-f, --report-format <report_format>#

The report format. Can be one of (‘yaml’, ‘json’)

-n, --graph-name <graph_name>#

User-specified name of the graph being summarized (default: ‘Graph’)

--node-facet-properties <node_facet_properties>#

A list of node properties from which to generate counts per value for those properties

--edge-facet-properties <edge_facet_properties>#

A list of edge properties from which to generate counts per value for those properties

-l, --error-log <error_log>#

File in which to report graph data parsing errors (default: “stderr”)

Arguments

INPUTS#

Required argument(s)
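As an illustration of how the options above fit together, the following minimal sketch assembles a `kgx graph-summary` command line in Python and prints it without running it. The file names, graph name and log path are hypothetical placeholders; only the flags are taken from the option list above.

```python
import shlex

# Hypothetical input files; replace with real node/edge files.
inputs = ["nodes.tsv", "edges.tsv"]

argv = [
    "kgx", "graph-summary",
    "--input-format", "tsv",
    "--output", "graph_stats.yaml",        # hypothetical output path
    "--report-type", "kgx-map",
    "--report-format", "yaml",
    "--graph-name", "Example Graph",       # hypothetical graph name
    "--error-log", "summary_errors.log",   # hypothetical log path
    *inputs,
]

# shlex.join quotes arguments containing spaces, so the printed line
# can be pasted into a POSIX shell as-is.
command = shlex.join(argv)
print(command)
```

Building the argument list first and quoting it once keeps values such as the graph name intact when they contain spaces.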

merge#

Load nodes and edges from files and KGs, as defined in a config YAML, and merge them into a single graph. The merged graph can then be written to a local/remote Neo4j instance OR be serialized into a file.

Note

Everything here is driven by the merge-config YAML.

Parameters#

merge_config: str

Merge config YAML

source: List

A list of sources to load from the YAML

destination: List

A list of destinations to write to, as defined in the YAML

processes: int

Number of processes to use

kgx merge [OPTIONS]

Options

--merge-config <merge_config>#

Required

--source <source>#

Source(s) from the YAML to process

--destination <destination>#

Destination(s) from the YAML to process

-p, --processes <processes>#

Number of processes to use
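The sketch below shows one way to assemble a `kgx merge` invocation from the options above. The config path and the source and destination names (`merge.yaml`, `gene_nodes`, `merged_tsv`) are hypothetical; in practice they must match keys defined in your own merge config YAML.

```python
import shlex

def merge_command(merge_config, source=None, destination=None, processes=None):
    """Assemble argv for `kgx merge`; only --merge-config is required."""
    argv = ["kgx", "merge", "--merge-config", merge_config]
    if source is not None:
        argv += ["--source", source]
    if destination is not None:
        argv += ["--destination", destination]
    if processes is not None:
        argv += ["--processes", str(processes)]
    return argv

# Hypothetical names; they stand in for entries in a real merge config.
argv = merge_command("merge.yaml", source="gene_nodes",
                     destination="merged_tsv", processes=2)
print(shlex.join(argv))
```

Leaving `source` and `destination` unset produces the shortest form, `kgx merge --merge-config merge.yaml`.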

neo4j-download#

Download nodes and edges from Neo4j database.

Parameters#

uri: str

Neo4j URI. For example, https://localhost:7474

username: str

Username for authentication

password: str

Password for authentication

output: str

Where to write the output (stdout, by default)

output_format: str

The output type (tsv, by default)

output_compression: str

The output compression type

stream: bool

Whether to parse input as a stream

node_filters: Tuple[str, str]

Node filters

edge_filters: Tuple[str, str]

Edge filters

kgx neo4j-download [OPTIONS]

Options

-l, --uri <uri>#

Required Neo4j URI to download from. For example, https://localhost:7474

-u, --username <username>#

Required Neo4j username

-p, --password <password>#

Required Neo4j password

-o, --output <output>#

Required Output

-f, --output-format <output_format>#

Required The output format. Can be one of (‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘neo4j’, ‘nt’, ‘null’, ‘sql’, ‘tsv’, ‘parquet’)

-d, --output-compression <output_compression>#

The output compression type

-s, --stream#

Parse input as a stream

-n, --node-filters <node_filters>#

Filters for filtering nodes from the input graph

-e, --edge-filters <edge_filters>#

Filters for filtering edges from the input graph
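A minimal sketch of a `kgx neo4j-download` command line follows. The credentials and output name are placeholders, and it assumes, based on the `Tuple[str, str]` parameter type above, that a filter is passed as a property name followed by a value after the flag.

```python
import shlex

def neo4j_download_command(uri, username, password, output,
                           output_format="tsv", node_filter=None):
    """Assemble argv for `kgx neo4j-download` from the documented options."""
    argv = [
        "kgx", "neo4j-download",
        "--uri", uri,
        "--username", username,
        "--password", password,
        "--output", output,
        "--output-format", output_format,
    ]
    if node_filter is not None:
        # Assumption: the (name, value) pair follows the flag as two values.
        name, value = node_filter
        argv += ["--node-filters", name, value]
    return argv

argv = neo4j_download_command(
    uri="https://localhost:7474",      # example URI from the option description
    username="neo4j",                  # placeholder credentials
    password="example-password",
    output="neo4j_dump",               # hypothetical output name
    node_filter=("category", "biolink:Gene"),
)
print(shlex.join(argv))
```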

neo4j-upload#

Upload a set of nodes/edges to a Neo4j database.

Parameters#

inputs: List[str]

A list of files that contain nodes/edges

input_format: str

The input format

input_compression: str

The input compression type

uri: str

The full HTTP address for Neo4j database

username: str

Username for authentication

password: str

Password for authentication

stream: bool

Whether to parse input as a stream

node_filters: Tuple[str, str]

Node filters

edge_filters: Tuple[str, str]

Edge filters

kgx neo4j-upload [OPTIONS] INPUTS...

Options

-i, --input-format <input_format>#

Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>#

The input compression type

-l, --uri <uri>#

Required Neo4j URI to upload to. For example, https://localhost:7474

-u, --username <username>#

Required Neo4j username

-p, --password <password>#

Required Neo4j password

-s, --stream#

Parse input as a stream

-n, --node-filters <node_filters>#

Filters for filtering nodes from the input graph

-e, --edge-filters <edge_filters>#

Filters for filtering edges from the input graph

Arguments

INPUTS#

Required argument(s)
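Because `neo4j-upload` takes the password on the command line, it can be useful to mask it before a command is logged. The sketch below builds an upload command with placeholder credentials and hypothetical input files, then prints a redacted copy; `redact_password` is an illustrative helper, not part of KGX.

```python
import shlex

def redact_password(argv):
    """Return a copy of argv with the value after -p/--password masked."""
    redacted = list(argv)
    for i, arg in enumerate(argv[:-1]):
        if arg in ("-p", "--password"):
            redacted[i + 1] = "***"
    return redacted

argv = [
    "kgx", "neo4j-upload",
    "--input-format", "tsv",
    "--uri", "https://localhost:7474",   # example URI from the option description
    "--username", "neo4j",               # placeholder credentials
    "--password", "example-password",
    "nodes.tsv", "edges.tsv",            # hypothetical input files
]

# Log the masked form; pass the original argv to the process runner.
print(shlex.join(redact_password(argv)))
```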

transform#

Transform a Knowledge Graph from one serialization form to another.

Parameters#

inputs: List[str]

A list of files that contain nodes/edges

input_format: str

The input format

input_compression: str

The input compression type

output: str

The output file

output_format: str

The output format

output_compression: str

The output compression type

stream: bool

Whether or not to stream

node_filters: Optional[List[Tuple[str, str]]]

Node input filters

edge_filters: Optional[List[Tuple[str, str]]]

Edge input filters

transform_config: str

Transform config YAML

source: List

A list of source(s) to load from the YAML

knowledge_sources: Optional[List[Tuple[str, str]]]

A list of named knowledge sources with (string, boolean or tuple rewrite) specification

infores_catalog: Optional[str]

Optional dump of a TSV file of InfoRes CURIE to Knowledge Source mappings

processes: int

Number of processes to use

kgx transform [OPTIONS] [INPUTS]...

Options

-i, --input-format <input_format>#

The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>#

The input compression type

-o, --output <output>#

Output

-f, --output-format <output_format>#

The output format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-d, --output-compression <output_compression>#

The output compression type

--stream#

Parse input as a stream

-n, --node-filters <node_filters>#

Filters for filtering nodes from the input graph

-e, --edge-filters <edge_filters>#

Filters for filtering edges from the input graph

--transform-config <transform_config>#

Transform config YAML

--source <source>#

Source(s) from the YAML to process

-k, --knowledge-sources <knowledge_sources>#

A named knowledge source with (string, boolean or tuple rewrite) specification

--infores-catalog <infores_catalog>#

Optional dump of a CSV file of InfoRes CURIE to Knowledge Source mappings

-p, --processes <processes>#

Number of processes to use

Arguments

INPUTS#

Optional argument(s)
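The following sketch assembles a `kgx transform` command that converts TSV input to JSON, checking the requested formats against the list given in the option descriptions above before building the command. File names are hypothetical, and `transform_command` is an illustrative helper rather than a KGX function.

```python
import shlex

# Format names copied from the --input-format / --output-format descriptions.
FORMATS = {
    "tsv", "csv", "graph", "json", "jsonl", "obojson", "obo-json",
    "trapi-json", "neo4j", "nt", "owl", "sssom", "parquet",
}

def transform_command(inputs, input_format, output, output_format, stream=False):
    """Assemble argv for `kgx transform`, rejecting unlisted format names."""
    for fmt in (input_format, output_format):
        if fmt not in FORMATS:
            raise ValueError(f"unsupported format: {fmt!r}")
    argv = [
        "kgx", "transform",
        "--input-format", input_format,
        "--output", output,
        "--output-format", output_format,
    ]
    if stream:
        argv.append("--stream")
    return argv + list(inputs)

# Hypothetical input and output files.
argv = transform_command(["nodes.tsv", "edges.tsv"], "tsv", "graph.json", "json",
                         stream=True)
print(shlex.join(argv))
```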

validate#

Run KGX validator on an input file to check for Biolink Model compliance.

Parameters#

inputs: List[str]

Input files

input_format: str

The input format

input_compression: str

The input compression type

output: str

Path to output file

biolink_release: Optional[str]

SemVer version of Biolink Model Release used for validation (default: latest Biolink Model Toolkit version)

kgx validate [OPTIONS] INPUTS...

Options

-i, --input-format <input_format>#

Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)

-c, --input-compression <input_compression>#

The input compression type

-o, --output <output>#

File to write validation reports to

-b, --biolink-release <biolink_release>#

Biolink Model Release (SemVer) used for validation (default: latest Biolink Model Toolkit version)

Arguments

INPUTS#

Required argument(s)
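As a sketch of how the `validate` options combine, the helper below builds the command line and checks that the Biolink release string has a plain MAJOR.MINOR.PATCH shape before passing it on. The file names and the release number are placeholders, and the shape check is an assumption about what counts as an acceptable SemVer value here, not KGX's own validation.

```python
import re
import shlex

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # plain MAJOR.MINOR.PATCH only

def validate_command(inputs, input_format, output=None, biolink_release=None):
    """Assemble argv for `kgx validate` from the documented options."""
    argv = ["kgx", "validate", "--input-format", input_format]
    if output is not None:
        argv += ["--output", output]
    if biolink_release is not None:
        if not SEMVER.match(biolink_release):
            raise ValueError(f"not a MAJOR.MINOR.PATCH version: {biolink_release!r}")
        argv += ["--biolink-release", biolink_release]
    return argv + list(inputs)

# Hypothetical files and a placeholder release number.
argv = validate_command(["nodes.tsv", "edges.tsv"], "tsv",
                        output="validation_report.txt",
                        biolink_release="3.1.2")
print(shlex.join(argv))
```

Omitting `biolink_release` leaves the choice to the default described above.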