KGX CLI#
kgx#
Knowledge Graph Exchange CLI entrypoint.
kgx [OPTIONS] COMMAND [ARGS]...
Options
- --version#
Show the version and exit.
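As a quick preflight before using any of the commands below, the following sketch checks whether the `kgx` entrypoint is available and, if so, asks it for its version. The helper name `kgx_version` is hypothetical; only the documented `--version` option is used.

```python
import shutil
import subprocess


def kgx_version():
    """Return the output of `kgx --version`, or None if `kgx` is not on PATH."""
    exe = shutil.which("kgx")
    if exe is None:
        return None
    # --version prints the version and exits, as documented above.
    done = subprocess.run([exe, "--version"], capture_output=True, text=True)
    return done.stdout.strip()


version = kgx_version()
print(version if version is not None else "kgx not found on PATH")
```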
graph-summary#
Loads and summarizes a knowledge graph from a set of input files.
Parameters#
- inputs: List[str]
Input files
- input_format: str
Input file format
- input_compression: Optional[str]
The input compression type
- output: Optional[str]
Where to write the output (stdout, by default)
- report_type: str
The summary report type: “kgx-map” or “meta-knowledge-graph”
- report_format: Optional[str]
The summary report file format: ‘yaml’ or ‘json’ (the default depends on report_type)
- graph_name: str
User-specified name of the graph being summarized
- node_facet_properties: Optional[List]
A list of node properties from which to generate counts per value for those properties. For example,
['provided_by']
- edge_facet_properties: Optional[List]
A list of edge properties from which to generate counts per value for those properties. For example,
['original_knowledge_source', 'aggregator_knowledge_source']
- error_log: str
Where to write any graph processing error messages (stderr by default, when the argument is empty)
kgx graph-summary [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>#
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>#
The input compression type
- -o, --output <output>#
Required
- -r, --report-type <report_type>#
The summary report type. Must be one of (‘kgx-map’, ‘meta-knowledge-graph’)
- -f, --report-format <report_format>#
The summary report format. Can be one of (‘yaml’, ‘json’)
- -n, --graph-name <graph_name>#
User specified name of graph being summarized (default: ‘Graph’)
- --node-facet-properties <node_facet_properties>#
A list of node properties from which to generate counts per value for those properties
- --edge-facet-properties <edge_facet_properties>#
A list of edge properties from which to generate counts per value for those properties
- -l, --error-log <error_log>#
File in which to report graph data parsing errors (default: “stderr”)
Arguments
- INPUTS#
Required argument(s)
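A minimal sketch of assembling a `graph-summary` invocation from the options above, written in Python so the command line can be inspected before it is run. The helper name `graph_summary_argv`, the file names (`nodes.tsv`, `edges.tsv`, `summary.yaml`) and the graph name are hypothetical; only flags listed above are used, and the facet-property options are left out because this page does not show how a list is passed to them.

```python
import shlex


def graph_summary_argv(inputs, input_format, output, report_type,
                       report_format=None, graph_name=None):
    """Build the argument vector for `kgx graph-summary` from documented options."""
    argv = ["kgx", "graph-summary",
            "-i", input_format,   # required
            "-o", output,         # required
            "-r", report_type]    # 'kgx-map' or 'meta-knowledge-graph'
    if report_format is not None:
        argv += ["-f", report_format]  # 'yaml' or 'json'
    if graph_name is not None:
        argv += ["-n", graph_name]
    # INPUTS are positional and come last.
    return argv + list(inputs)


argv = graph_summary_argv(["nodes.tsv", "edges.tsv"], "tsv", "summary.yaml",
                          "kgx-map", report_format="yaml",
                          graph_name="Example Graph")
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` would execute the command, assuming `kgx` is installed and the input files exist.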
merge#
Load nodes and edges from files and KGs, as defined in a config YAML, and merge them into a single graph. The merged graph can then be written to a local/remote Neo4j instance OR be serialized into a file.
Note
Everything here is driven by the merge-config YAML.
Parameters#
- merge_config: str
Merge config YAML
- source: List
A list of sources to load from the YAML
- destination: List
A list of destinations to write to, as defined in the YAML
- processes: int
Number of processes to use
kgx merge [OPTIONS]
Options
- --merge-config <merge_config>#
Required
- --source <source>#
Source(s) from the YAML to process
- --destination <destination>#
Destination(s) from the YAML to process
- -p, --processes <processes>#
Number of processes to use
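The sketch below shows one way to build a `merge` command that restricts processing to selected sources and destinations. It assumes, without confirmation from this page, that `--source` and `--destination` are repeated once per name. The config file name `merge.yaml` and the source and destination names are hypothetical placeholders for keys defined in your own YAML.

```python
import shlex


def merge_argv(merge_config, sources=(), destinations=(), processes=None):
    """Build the argument vector for `kgx merge` from documented options."""
    argv = ["kgx", "merge", "--merge-config", merge_config]  # required
    # Assumption: each selected source/destination gets its own flag.
    for name in sources:
        argv += ["--source", name]
    for name in destinations:
        argv += ["--destination", name]
    if processes is not None:
        argv += ["-p", str(processes)]
    return argv


argv = merge_argv("merge.yaml",
                  sources=["source-a", "source-b"],
                  destinations=["merged-file"],
                  processes=4)
print(shlex.join(argv))
```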
neo4j-download#
Download nodes and edges from a Neo4j database.
Parameters#
- uri: str
Neo4j URI. For example, https://localhost:7474
- username: str
Username for authentication
- password: str
Password for authentication
- output: str
Where to write the output (stdout, by default)
- output_format: str
The output type (tsv, by default)
- output_compression: str
The output compression type
- stream: bool
Whether to parse input as a stream
- node_filters: Tuple[str, str]
Node filters
- edge_filters: Tuple[str, str]
Edge filters
kgx neo4j-download [OPTIONS]
Options
- -l, --uri <uri>#
Required Neo4j URI to download from. For example, https://localhost:7474
- -u, --username <username>#
Required Neo4j username
- -p, --password <password>#
Required Neo4j password
- -o, --output <output>#
Required Output
- -f, --output-format <output_format>#
Required The output format. Can be one of (‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘neo4j’, ‘nt’, ‘null’, ‘sql’, ‘tsv’, ‘parquet’)
- -d, --output-compression <output_compression>#
The output compression type
- -s, --stream#
Parse input as a stream
- -n, --node-filters <node_filters>#
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>#
Filters for filtering edges from the input graph
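A minimal sketch of a `neo4j-download` invocation using the required options above. The password is read from an environment variable rather than hard-coded, and is masked before the command is printed. The variable name `NEO4J_PASSWORD`, the fallback value, the username `neo4j`, and the output name `neo4j-export` are all assumptions made for illustration.

```python
import os
import shlex


def neo4j_download_argv(uri, username, password, output, output_format,
                        output_compression=None, stream=False):
    """Build the argument vector for `kgx neo4j-download`."""
    argv = ["kgx", "neo4j-download",
            "-l", uri, "-u", username, "-p", password,
            "-o", output, "-f", output_format]
    if output_compression is not None:
        argv += ["-d", output_compression]
    if stream:
        argv.append("-s")  # flag without a value
    return argv


# Hypothetical environment variable; the fallback is a placeholder only.
password = os.environ.get("NEO4J_PASSWORD", "example-password")
argv = neo4j_download_argv("https://localhost:7474", "neo4j", password,
                           "neo4j-export", "tsv", stream=True)

# Mask the value that follows -p before showing the command.
shown = ["***" if prev == "-p" else arg
         for prev, arg in zip([""] + argv, argv)]
print(shlex.join(shown))
```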
neo4j-upload#
Upload a set of nodes/edges to a Neo4j database.
Parameters#
- inputs: List[str]
A list of files that contain nodes/edges
- input_format: str
The input format
- input_compression: str
The input compression type
- uri: str
The full HTTP address for Neo4j database
- username: str
Username for authentication
- password: str
Password for authentication
- stream: bool
Whether to parse input as a stream
- node_filters: Tuple[str, str]
Node filters
- edge_filters: Tuple[str, str]
Edge filters
kgx neo4j-upload [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>#
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>#
The input compression type
- -l, --uri <uri>#
Required Neo4j URI to upload to. For example, https://localhost:7474
- -u, --username <username>#
Required Neo4j username
- -p, --password <password>#
Required Neo4j password
- -s, --stream#
Parse input as a stream
- -n, --node-filters <node_filters>#
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>#
Filters for filtering edges from the input graph
Arguments
- INPUTS#
Required argument(s)
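The filter options are typed as `Tuple[str, str]` in the parameter list above. The sketch below assumes this means each filter is given as the flag followed by two values (a property name and a value) and that the flag can be repeated; that reading is an inference, not something this page states. The helper names, the filter `("category", "biolink:Gene")`, the credentials, and the file names are hypothetical.

```python
import shlex


def filter_args(flag, filters):
    """Expand [(key, value), ...] into repeated `flag key value` groups."""
    args = []
    for key, value in filters:
        args += [flag, key, value]
    return args


def neo4j_upload_argv(inputs, input_format, uri, username, password,
                      node_filters=(), edge_filters=()):
    """Build the argument vector for `kgx neo4j-upload`."""
    argv = ["kgx", "neo4j-upload", "-i", input_format,
            "-l", uri, "-u", username, "-p", password]
    argv += filter_args("-n", node_filters)
    argv += filter_args("-e", edge_filters)
    # INPUTS are positional and come last.
    return argv + list(inputs)


argv = neo4j_upload_argv(["nodes.tsv", "edges.tsv"], "tsv",
                         "https://localhost:7474", "neo4j", "example-password",
                         node_filters=[("category", "biolink:Gene")])
print(shlex.join(argv))
```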
transform#
Transform a Knowledge Graph from one serialization form to another.
Parameters#
- inputs: List[str]
A list of files that contain nodes/edges
- input_format: str
The input format
- input_compression: str
The input compression type
- output: str
The output file
- output_format: str
The output format
- output_compression: str
The output compression type
- stream: bool
Whether or not to stream
- node_filters: Optional[List[Tuple[str, str]]]
Node input filters
- edge_filters: Optional[List[Tuple[str, str]]]
Edge input filters
- transform_config: str
Transform config YAML
- source: List
A list of source(s) to load from the YAML
- knowledge_sources: Optional[List[Tuple[str, str]]]
A list of named knowledge sources with (string, boolean or tuple rewrite) specification
- infores_catalog: Optional[str]
Optional dump of a TSV file of InfoRes CURIE to Knowledge Source mappings
- processes: int
Number of processes to use
kgx transform [OPTIONS] [INPUTS]...
Options
- -i, --input-format <input_format>#
The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>#
The input compression type
- -o, --output <output>#
Output
- -f, --output-format <output_format>#
The output format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)
- -d, --output-compression <output_compression>#
The output compression type
- --stream#
Parse input as a stream
- -n, --node-filters <node_filters>#
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>#
Filters for filtering edges from the input graph
- --transform-config <transform_config>#
Transform config YAML
- --source <source>#
Source(s) from the YAML to process
- -k, --knowledge-sources <knowledge_sources>#
A named knowledge source with (string, boolean or tuple rewrite) specification
- --infores-catalog <infores_catalog>#
Optional dump of a CSV file of InfoRes CURIE to Knowledge Source mappings
- -p, --processes <processes>#
Number of processes to use
Arguments
- INPUTS#
Optional argument(s)
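A minimal sketch of a direct file-to-file `transform` call (without a transform config), which checks the requested formats against the list documented above before building the command. The helper name `transform_argv` and the file names are hypothetical. Note that `transform` exposes streaming only as `--stream`, with no short form listed.

```python
import shlex

# Format names copied from the option descriptions above.
FORMATS = {"tsv", "csv", "graph", "json", "jsonl", "obojson", "obo-json",
           "trapi-json", "neo4j", "nt", "owl", "sssom", "parquet"}


def transform_argv(inputs, input_format, output, output_format,
                   stream=False, processes=None):
    """Build the argument vector for `kgx transform` on explicit input files."""
    for label, fmt in (("input", input_format), ("output", output_format)):
        if fmt not in FORMATS:
            raise ValueError(f"unsupported {label} format: {fmt!r}")
    argv = ["kgx", "transform",
            "-i", input_format, "-o", output, "-f", output_format]
    if stream:
        argv.append("--stream")
    if processes is not None:
        argv += ["-p", str(processes)]
    # INPUTS are positional and come last.
    return argv + list(inputs)


argv = transform_argv(["nodes.tsv", "edges.tsv"], "tsv",
                      "graph.json", "json", stream=True)
print(shlex.join(argv))
```

Checking the format locally only catches typos early; the CLI itself remains the authority on which formats it accepts.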
validate#
Run KGX validator on an input file to check for Biolink Model compliance.
Parameters#
- inputs: List[str]
Input files
- input_format: str
The input format
- input_compression: str
The input compression type
- output: str
Path to output file
- biolink_release: Optional[str]
SemVer version of Biolink Model Release used for validation (default: latest Biolink Model Toolkit version)
kgx validate [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>#
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘nt’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>#
The input compression type
- -o, --output <output>#
File to write validation reports to
- -b, --biolink-release <biolink_release>#
Biolink Model Release (SemVer) used for validation (default: latest Biolink Model Toolkit version)
Arguments
- INPUTS#
Required argument(s)
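The sketch below builds a `validate` command pinned to a specific Biolink Model release. Since the option expects a SemVer string, the helper rejects values that are not of the plain `MAJOR.MINOR.PATCH` form; this check is a simplification of full SemVer. The helper name `validate_argv`, the release `3.1.2`, and the file names are example values, not recommendations from this page.

```python
import re
import shlex

# Simplified SemVer check: MAJOR.MINOR.PATCH with numeric parts only.
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def validate_argv(inputs, input_format, output=None, biolink_release=None):
    """Build the argument vector for `kgx validate`."""
    argv = ["kgx", "validate", "-i", input_format]  # -i is required
    if output is not None:
        argv += ["-o", output]
    if biolink_release is not None:
        if not SEMVER.match(biolink_release):
            raise ValueError(f"not a SemVer release: {biolink_release!r}")
        argv += ["-b", biolink_release]
    # INPUTS are positional and come last.
    return argv + list(inputs)


argv = validate_argv(["nodes.tsv", "edges.tsv"], "tsv",
                     output="validation-report.txt", biolink_release="3.1.2")
print(shlex.join(argv))
```

Omitting `biolink_release` leaves the choice to the tool, which defaults to the latest Biolink Model Toolkit version as described above.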