KGX CLI¶
kgx¶
Knowledge Graph Exchange CLI entrypoint.
Usage
kgx [OPTIONS] COMMAND [ARGS]...
Options
- --version¶
Show the version and exit.
arangodb-download¶
Download nodes and edges from an ArangoDB database.
Parameters¶
- uri: str
ArangoDB URI. For example, http://localhost:8529
- database: str
The database name
- username: str
Username for authentication
- password: str
Password for authentication
- output: str
Where to write the output (stdout, by default)
- output_format: str
The output type (tsv, by default)
- output_compression: str
The output compression type
- stream: bool
Whether to parse input as a stream
- node_filters: Tuple[str, str]
Node filters
- edge_filters: Tuple[str, str]
Edge filters
- node_collection: Tuple[str]
Names of vertex collections
- edge_collection: Tuple[str]
Names of edge collections
- all_collections: bool
Whether to discover and export all non-system collections
Usage
kgx arangodb-download [OPTIONS]
Options
- -l, --uri <uri>¶
Required ArangoDB URI to download from. For example, http://localhost:8529
- -d, --database <database>¶
Required ArangoDB database name
- -u, --username <username>¶
Required ArangoDB username
- -p, --password <password>¶
Required ArangoDB password
- -o, --output <output>¶
Required Output
- -f, --output-format <output_format>¶
Required The output format. Can be one of (‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘neo4j’, ‘arangodb’, ‘nt’, ‘jelly’, ‘null’, ‘sql’, ‘tsv’, ‘parquet’)
- --output-compression <output_compression>¶
The output compression type
- -s, --stream¶
Parse input as a stream
- -n, --node-filters <node_filters>¶
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>¶
Filters for filtering edges from the input graph
- --node-collection <node_collection>¶
Name of a vertex collection (repeatable; default: nodes)
- --edge-collection <edge_collection>¶
Name of an edge collection (repeatable; default: edges)
- --all-collections¶
Discover and export all non-system collections in the database
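As an illustration of how the options above combine, the following sketch builds an `arangodb-download` command line in Python without running it. The helper name `arangodb_download_argv`, the database name, the credentials, the output path, and the collection names are all placeholders invented for this example; only flags listed above are used.

```python
import shlex


def arangodb_download_argv(uri, database, username, password, output,
                           output_format="tsv",
                           node_collections=(), edge_collections=()):
    """Build (but do not run) an argument list for `kgx arangodb-download`."""
    argv = [
        "kgx", "arangodb-download",
        "--uri", uri,
        "--database", database,
        "--username", username,
        "--password", password,
        "--output", output,
        "--output-format", output_format,
    ]
    # --node-collection and --edge-collection are documented as repeatable,
    # so each collection name gets its own occurrence of the flag.
    for name in node_collections:
        argv += ["--node-collection", name]
    for name in edge_collections:
        argv += ["--edge-collection", name]
    return argv


# Placeholder values for illustration only.
download_argv = arangodb_download_argv(
    "http://localhost:8529", "my_kg", "root", "secret", "my_kg_export",
    node_collections=["genes", "diseases"],
    edge_collections=["associations"],
)
print(shlex.join(download_argv))
```

With KGX installed, the resulting list could be handed to `subprocess.run(download_argv, check=True)`.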
arangodb-upload¶
Upload a set of nodes/edges to an ArangoDB database.
Parameters¶
- inputs: List[str]
A list of files that contain nodes/edges
- input_format: str
The input format
- input_compression: str
The input compression type
- uri: str
The full HTTP address of the ArangoDB database
- database: str
The database name
- username: str
Username for authentication
- password: str
Password for authentication
- stream: bool
Whether to parse input as a stream
- node_filters: Tuple[str, str]
Node filters
- edge_filters: Tuple[str, str]
Edge filters
- node_collection: str
Name of the vertex collection
- edge_collection: str
Name of the edge collection
- curie_routing: bool
Whether to route to per-CURIE-prefix collections
Usage
kgx arangodb-upload [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>¶
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘arangodb’, ‘duckdb’, ‘nt’, ‘jelly’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>¶
The input compression type
- -l, --uri <uri>¶
Required ArangoDB URI to upload to. For example, http://localhost:8529
- -d, --database <database>¶
Required ArangoDB database name
- -u, --username <username>¶
Required ArangoDB username
- -p, --password <password>¶
Required ArangoDB password
- -s, --stream¶
Parse input as a stream
- -n, --node-filters <node_filters>¶
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>¶
Filters for filtering edges from the input graph
- --node-collection <node_collection>¶
Name of the vertex collection (default: nodes)
- --edge-collection <edge_collection>¶
Name of the edge collection (default: edges)
- --curie-routing¶
Route nodes/edges to per-CURIE-prefix collections (e.g., CL:1000300 -> collection CL)
Arguments
- INPUTS¶
Required argument(s)
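The `--curie-routing` description gives the example `CL:1000300 -> collection CL`. The sketch below mirrors that mapping by taking the text before the first colon, then builds an upload command with routing enabled. It is an illustration only: the helper names, file names, and credentials are hypothetical, and the routing logic inside KGX may differ in detail.

```python
def curie_prefix(curie):
    """Return the part of a CURIE before the first colon."""
    prefix, sep, _local_id = curie.partition(":")
    if not sep or not prefix:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix


def arangodb_upload_argv(inputs, input_format, uri, database,
                         username, password, curie_routing=False):
    """Build (but do not run) an argument list for `kgx arangodb-upload`."""
    argv = [
        "kgx", "arangodb-upload",
        "--input-format", input_format,
        "--uri", uri,
        "--database", database,
        "--username", username,
        "--password", password,
    ]
    if curie_routing:
        argv.append("--curie-routing")
    # INPUTS are positional and follow the options.
    return argv + list(inputs)


print(curie_prefix("CL:1000300"))  # prints CL

# Placeholder values for illustration only.
routed_upload_argv = arangodb_upload_argv(
    ["nodes.tsv", "edges.tsv"], "tsv",
    "http://localhost:8529", "my_kg", "root", "secret",
    curie_routing=True,
)
```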
graph-summary¶
Loads and summarizes a knowledge graph from a set of input files.
Parameters¶
- inputs: List[str]
Input files
- input_format: str
Input file format
- input_compression: Optional[str]
The input compression type
- output: Optional[str]
Where to write the output (stdout, by default)
- report_type: str
The summary report type: “kgx-map” or “meta-knowledge-graph”
- report_format: Optional[str]
The summary report file format: ‘yaml’ or ‘json’ (the default depends on report_type)
- graph_name: str
User-specified name of the graph being summarized
- node_facet_properties: Optional[List]
A list of node properties from which to generate counts per value for those properties. For example, ['provided_by']
- edge_facet_properties: Optional[List]
A list of edge properties from which to generate counts per value for those properties. For example, ['original_knowledge_source', 'aggregator_knowledge_source']
- error_log: str
Where to write any graph processing error messages (stderr by default, when the argument is empty)
Usage
kgx graph-summary [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>¶
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘arangodb’, ‘duckdb’, ‘nt’, ‘jelly’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>¶
The input compression type
- -o, --output <output>¶
Required
- -r, --report-type <report_type>¶
The summary report type. Must be one of (‘kgx-map’, ‘meta-knowledge-graph’)
- -f, --report-format <report_format>¶
The report format. Can be one of (‘yaml’, ‘json’)
- -n, --graph-name <graph_name>¶
User specified name of graph being summarized (default: ‘Graph’)
- --node-facet-properties <node_facet_properties>¶
A list of node properties from which to generate counts per value for those properties
- --edge-facet-properties <edge_facet_properties>¶
A list of edge properties from which to generate counts per value for those properties
- -l, --error-log <error_log>¶
File in which to report graph data parsing errors (default: “stderr”)
Arguments
- INPUTS¶
Required argument(s)
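A possible way to assemble a `graph-summary` invocation is sketched below. It checks the report type and format against the choices listed above before building the argument list. The helper name and file names are hypothetical, and passing each facet property as its own occurrence of the flag is an assumption about how the list-valued options are supplied on the command line.

```python
REPORT_TYPES = ("kgx-map", "meta-knowledge-graph")
REPORT_FORMATS = ("yaml", "json")


def graph_summary_argv(inputs, input_format, output,
                       report_type=None, report_format=None,
                       graph_name=None, node_facets=(), edge_facets=()):
    """Build (but do not run) an argument list for `kgx graph-summary`."""
    if report_type is not None and report_type not in REPORT_TYPES:
        raise ValueError(f"report_type must be one of {REPORT_TYPES}")
    if report_format is not None and report_format not in REPORT_FORMATS:
        raise ValueError(f"report_format must be one of {REPORT_FORMATS}")

    argv = ["kgx", "graph-summary",
            "--input-format", input_format,
            "--output", output]
    if report_type is not None:
        argv += ["--report-type", report_type]
    if report_format is not None:
        argv += ["--report-format", report_format]
    if graph_name is not None:
        argv += ["--graph-name", graph_name]
    # Assumption: one flag occurrence per facet property.
    for prop in node_facets:
        argv += ["--node-facet-properties", prop]
    for prop in edge_facets:
        argv += ["--edge-facet-properties", prop]
    return argv + list(inputs)


# Placeholder file and graph names for illustration only.
summary_argv = graph_summary_argv(
    ["nodes.tsv", "edges.tsv"], "tsv", "summary.yaml",
    report_type="kgx-map", report_format="yaml", graph_name="My Graph",
    node_facets=["provided_by"],
    edge_facets=["original_knowledge_source", "aggregator_knowledge_source"],
)
```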
merge¶
Load nodes and edges from files and KGs, as defined in a config YAML, and merge them into a single graph. The merged graph can then be written to a local/remote Neo4j instance OR be serialized into a file.
Note
Everything here is driven by the merge-config YAML.
Parameters¶
- merge_config: str
Merge config YAML
- source: List
A list of sources to load from the YAML
- destination: List
A list of destinations to write to, as defined in the YAML
- processes: int
Number of processes to use
Usage
kgx merge [OPTIONS]
Options
- --merge-config <merge_config>¶
Required
- --source <source>¶
Source(s) from the YAML to process
- --destination <destination>¶
Destination(s) from the YAML to process
- -p, --processes <processes>¶
Number of processes to use
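Since `merge` takes everything from the config file, a command line for it is short. The sketch below builds one, optionally narrowing the run to named sources and destinations. The helper name, the config file name, and the source and destination names are hypothetical, and repeating `--source` / `--destination` once per name is an assumption about how multiple values are passed.

```python
def merge_argv(merge_config, sources=(), destinations=(), processes=None):
    """Build (but do not run) an argument list for `kgx merge`."""
    argv = ["kgx", "merge", "--merge-config", merge_config]
    # Assumption: one flag occurrence per source or destination name.
    for name in sources:
        argv += ["--source", name]
    for name in destinations:
        argv += ["--destination", name]
    if processes is not None:
        # Command-line arguments must be strings.
        argv += ["--processes", str(processes)]
    return argv


# Placeholder names for illustration only.
merge_cmd = merge_argv(
    "merge.yaml",
    sources=["source-a", "source-b"],
    destinations=["merged-file"],
    processes=4,
)
```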
neo4j-download¶
Download nodes and edges from a Neo4j database.
Parameters¶
- uri: str
Neo4j URI. For example, https://localhost:7474
- username: str
Username for authentication
- password: str
Password for authentication
- output: str
Where to write the output (stdout, by default)
- output_format: str
The output type (tsv, by default)
- output_compression: str
The output compression type
- stream: bool
Whether to parse input as a stream
- node_filters: Tuple[str, str]
Node filters
- edge_filters: Tuple[str, str]
Edge filters
Usage
kgx neo4j-download [OPTIONS]
Options
- -l, --uri <uri>¶
Required Neo4j URI to download from. For example, https://localhost:7474
- -u, --username <username>¶
Required Neo4j username
- -p, --password <password>¶
Required Neo4j password
- -o, --output <output>¶
Required Output
- -f, --output-format <output_format>¶
Required The output format. Can be one of (‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘neo4j’, ‘arangodb’, ‘nt’, ‘jelly’, ‘null’, ‘sql’, ‘tsv’, ‘parquet’)
- -d, --output-compression <output_compression>¶
The output compression type
- -s, --stream¶
Parse input as a stream
- -n, --node-filters <node_filters>¶
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>¶
Filters for filtering edges from the input graph
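The following sketch builds a `neo4j-download` command with node filters. It uses long option names throughout, because `-d` means `--output-compression` here but `--database` in the ArangoDB commands. The helper name, credentials, output path, and the filter shown are hypothetical. Passing each filter as two values after the flag is an assumption based on the `Tuple[str, str]` parameter type.

```python
def neo4j_download_argv(uri, username, password, output,
                        output_format="tsv", node_filters=(), stream=False):
    """Build (but do not run) an argument list for `kgx neo4j-download`."""
    argv = [
        "kgx", "neo4j-download",
        "--uri", uri,
        "--username", username,
        "--password", password,
        "--output", output,
        "--output-format", output_format,
    ]
    if stream:
        argv.append("--stream")
    # Assumption: each filter is a (key, value) pair given as two arguments.
    for key, value in node_filters:
        argv += ["--node-filters", key, value]
    return argv


# Placeholder values for illustration only.
neo4j_download_cmd = neo4j_download_argv(
    "https://localhost:7474", "neo4j", "secret", "neo4j_export",
    output_format="jsonl",
    node_filters=[("category", "biolink:Gene")],
    stream=True,
)
```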
neo4j-upload¶
Upload a set of nodes/edges to a Neo4j database.
Parameters¶
- inputs: List[str]
A list of files that contain nodes/edges
- input_format: str
The input format
- input_compression: str
The input compression type
- uri: str
The full HTTP address of the Neo4j database
- username: str
Username for authentication
- password: str
Password for authentication
- stream: bool
Whether to parse input as a stream
- node_filters: Tuple[str, str]
Node filters
- edge_filters: Tuple[str, str]
Edge filters
Usage
kgx neo4j-upload [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>¶
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘arangodb’, ‘duckdb’, ‘nt’, ‘jelly’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>¶
The input compression type
- -l, --uri <uri>¶
Required Neo4j URI to upload to. For example, https://localhost:7474
- -u, --username <username>¶
Required Neo4j username
- -p, --password <password>¶
Required Neo4j password
- -s, --stream¶
Parse input as a stream
- -n, --node-filters <node_filters>¶
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>¶
Filters for filtering edges from the input graph
Arguments
- INPUTS¶
Required argument(s)
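Because `--password` is required, scripts that call `neo4j-upload` need to get the password from somewhere. One option, sketched below, is to read it from an environment variable. The helper name, the variable name `NEO4J_PASSWORD`, and the file names are choices made for this example, not something KGX defines.

```python
import os


def neo4j_upload_argv(inputs, input_format, uri, username, env=os.environ):
    """Build (but do not run) an argument list for `kgx neo4j-upload`."""
    # NEO4J_PASSWORD is a hypothetical variable name chosen for this sketch.
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("set NEO4J_PASSWORD before uploading")
    argv = [
        "kgx", "neo4j-upload",
        "--input-format", input_format,
        "--uri", uri,
        "--username", username,
        "--password", password,
    ]
    # INPUTS are positional and follow the options.
    return argv + list(inputs)


# An explicit mapping stands in for the real environment here.
neo4j_upload_cmd = neo4j_upload_argv(
    ["nodes.tsv", "edges.tsv"], "tsv", "https://localhost:7474", "neo4j",
    env={"NEO4J_PASSWORD": "example-only"},
)
```

This keeps the password out of the script itself, although it still appears in the argument list of the launched process.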
transform¶
Transform a Knowledge Graph from one serialization form to another.
Parameters¶
- inputs: List[str]
A list of files that contain nodes/edges
- input_format: str
The input format
- input_compression: str
The input compression type
- output: str
The output file
- output_format: str
The output format
- output_compression: str
The output compression type
- stream: bool
Whether or not to stream
- node_filters: Optional[List[Tuple[str, str]]]
Node input filters
- edge_filters: Optional[List[Tuple[str, str]]]
Edge input filters
- transform_config: str
Transform config YAML
- source: List
A list of source(s) to load from the YAML
- knowledge_sources: Optional[List[Tuple[str, str]]]
A list of named knowledge sources with (string, boolean or tuple rewrite) specification
- infores_catalog: Optional[str]
Optional dump of a TSV file of InfoRes CURIE to Knowledge Source mappings
- processes: int
Number of processes to use
Usage
kgx transform [OPTIONS] [INPUTS]...
Options
- -i, --input-format <input_format>¶
The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘arangodb’, ‘duckdb’, ‘nt’, ‘jelly’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>¶
The input compression type
- -o, --output <output>¶
Output
- -f, --output-format <output_format>¶
The output format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘arangodb’, ‘duckdb’, ‘nt’, ‘jelly’, ‘owl’, ‘sssom’, ‘parquet’)
- -d, --output-compression <output_compression>¶
The output compression type
- --stream¶
Parse input as a stream
- -n, --node-filters <node_filters>¶
Filters for filtering nodes from the input graph
- -e, --edge-filters <edge_filters>¶
Filters for filtering edges from the input graph
- --transform-config <transform_config>¶
Transform config YAML
- --source <source>¶
Source(s) from the YAML to process
- -k, --knowledge-sources <knowledge_sources>¶
A named knowledge source with (string, boolean or tuple rewrite) specification
- --infores-catalog <infores_catalog>¶
Optional dump of a CSV file of InfoRes CURIE to Knowledge Source mappings
- -p, --processes <processes>¶
Number of processes to use
Arguments
- INPUTS¶
Optional argument(s)
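The option list suggests two ways of driving `transform`: naming input files and formats directly, or pointing at a transform config YAML. The sketch below builds a command line for either style. The helper name and file names are hypothetical, and the check that rejects a call with neither inputs nor a config is this sketch's own choice, not documented KGX behaviour.

```python
def transform_argv(inputs=(), input_format=None, output=None,
                   output_format=None, transform_config=None, processes=None):
    """Build (but do not run) an argument list for `kgx transform`."""
    if transform_config is None and not inputs:
        raise ValueError("provide input files or a transform config")

    argv = ["kgx", "transform"]
    if input_format is not None:
        argv += ["--input-format", input_format]
    if output is not None:
        argv += ["--output", output]
    if output_format is not None:
        argv += ["--output-format", output_format]
    if transform_config is not None:
        argv += ["--transform-config", transform_config]
    if processes is not None:
        argv += ["--processes", str(processes)]
    # INPUTS are optional positional arguments.
    return argv + list(inputs)


# Style 1: direct conversion of TSV files to JSON Lines (placeholder names).
direct_cmd = transform_argv(
    ["nodes.tsv", "edges.tsv"],
    input_format="tsv", output="graph", output_format="jsonl",
)

# Style 2: everything described in a config file (placeholder name).
config_cmd = transform_argv(transform_config="transform.yaml", processes=2)
```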
validate¶
Run KGX validator on an input file to check for Biolink Model compliance.
Parameters¶
- inputs: List[str]
Input files
- input_format: str
The input format
- input_compression: str
The input compression type
- output: str
Path to output file
- biolink_release: Optional[str]
SemVer version of Biolink Model Release used for validation (default: latest Biolink Model Toolkit version)
Usage
kgx validate [OPTIONS] INPUTS...
Options
- -i, --input-format <input_format>¶
Required The input format. Can be one of (‘tsv’, ‘csv’, ‘graph’, ‘json’, ‘jsonl’, ‘obojson’, ‘obo-json’, ‘trapi-json’, ‘neo4j’, ‘arangodb’, ‘duckdb’, ‘nt’, ‘jelly’, ‘owl’, ‘sssom’, ‘parquet’)
- -c, --input-compression <input_compression>¶
The input compression type
- -o, --output <output>¶
File to write validation reports to
- -b, --biolink-release <biolink_release>¶
Biolink Model Release (SemVer) used for validation (default: latest Biolink Model Toolkit version)
Arguments
- INPUTS¶
Required argument(s)
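To tie the pieces together, here is a sketch that builds a `validate` command and runs it only when the executable is found on `PATH`. The helper names, file names, and the Biolink release string are placeholders for illustration. What exit code `kgx validate` returns for an invalid graph is not stated above, so the sketch simply passes the code back to the caller.

```python
import shlex
import shutil
import subprocess


def validate_argv(inputs, input_format, output=None, biolink_release=None):
    """Build an argument list for `kgx validate`."""
    argv = ["kgx", "validate", "--input-format", input_format]
    if output is not None:
        argv += ["--output", output]
    if biolink_release is not None:
        argv += ["--biolink-release", biolink_release]
    return argv + list(inputs)


def run_if_available(argv):
    """Run argv when its executable is on PATH; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, check=False).returncode


# Placeholder file names and SemVer string for illustration only.
validate_cmd = validate_argv(
    ["nodes.tsv", "edges.tsv"], "tsv",
    output="validation_report.txt", biolink_release="4.2.0",
)
print(shlex.join(validate_cmd))
# To execute it: exit_code = run_if_available(validate_cmd)
```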