Class: IngestMetadataFile
Information about a particular ingest performed to produce a KGX graph, describing source data, ingest process, and details of the resulting KGX graph.
URI: biolink:IngestMetadataFile
classDiagram
class IngestMetadataFile
click IngestMetadataFile href "../IngestMetadataFile/"
IngestMetadataFile : edge_predicates
IngestMetadataFile : file_created_by
IngestMetadataFile : file_creation_date
IngestMetadataFile : file_name
IngestMetadataFile : ingest_code_url
IngestMetadataFile : ingest_code_version
IngestMetadataFile : node_categories
IngestMetadataFile : node_normalizer
IngestMetadataFile : node_normalizer_url
IngestMetadataFile : node_normalizer_version
IngestMetadataFile : orphan_node_count
IngestMetadataFile : source_access_date
IngestMetadataFile : source_access_urls
IngestMetadataFile : source_data_version
IngestMetadataFile : source_file_names
IngestMetadataFile : source_infores_id
IngestMetadataFile : target_creation_date
IngestMetadataFile : target_data_model_version
IngestMetadataFile : target_data_url
IngestMetadataFile : target_data_version
IngestMetadataFile : target_format
IngestMetadataFile : target_model
IngestMetadataFile : target_model_url
IngestMetadataFile : target_name
IngestMetadataFile : total_edge_count
IngestMetadataFile : total_node_count
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
file_name | 1 String | An informative, human-readable name for this metadata file/object | direct |
file_created_by | * String | The agent(s) (person if hand-authored, software tool if created programmatica... | direct |
file_creation_date | 1 Date | When this metadata file was created | direct |
ingest_code_url | 1 Uriorcurie | URL of the specific official release or tagged branch of the ingest code exec... | direct |
ingest_code_version | 1 String | The version of the ingest code executed to perform the ingest | direct |
source_infores_id | 1 Uriorcurie | Infores identifier of the ingested source | direct |
source_data_version | 1 String | Version of the source data ingested - using source's own conventions, or make... | direct |
source_access_date | 1 Date | Date the source data was accessed/downloaded into the system that performed t... | direct |
source_access_urls | * Uriorcurie | URLs where source data was accessed / downloaded / queried to be brought into... | direct |
source_file_names | * String | File names from which content used to produce the output KGX graph was retrie... | direct |
target_name | 1 String | A unique human readable name of the target data set/graph produced by this in... | direct |
target_creation_date | 1 Date | Date the target data set/graph was created | direct |
target_data_url | 1 Uriorcurie | URL where the dataset/graph can be retrieved | direct |
target_data_version | 1 String | version of the target dataset/graph | direct |
target_format | 1 String | Format in which the KGX graph is serialized (e.g., "KGX-jsonlines") | direct |
target_model | 1 String | Name/identifier of the data model used to structure the KGX graph data | direct |
target_model_url | 0..1 String | A URL providing information about the data model used to structure the KGX gr... | direct |
target_data_model_version | 1 String | Version of the target data model used to structure the KGX graph data | direct |
node_normalizer | 0..1 String | Name/identifier of the algorithm/tool used to perform node normalization on t... | direct |
node_normalizer_version | 0..1 String | Version of the node normalization tool/algorithm | direct |
node_normalizer_url | 0..1 String | URL(s) pointing to source code and/or information about the node normalizatio... | direct |
total_edge_count | 0..1 Integer | Count of total edges in the graph | direct |
total_node_count | 0..1 Integer | Count of total nodes in the graph | direct |
orphan_node_count | 0..1 Integer | Count of nodes in the graph that do not participate in an edge | direct |
node_categories | * String | List of all Biolink categories used for nodes in the graph | direct |
edge_predicates | * String | List of all Biolink predicates used in the graph | direct |
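The cardinalities in the table above can be turned into a lightweight pre-flight check. The following is a minimal sketch, not part of the schema or of any KGX tooling: the function name `check_ingest_metadata`, the grouping constants, and all example values (including the `example.org` URLs and the `infores:ctd` identifier) are assumptions made for illustration. It only checks that cardinality-1 slots are present, that `*` slots are lists, and that date slots written as strings parse as ISO dates.

```python
from datetime import date

# Slots shown with cardinality "1" in the table above.
REQUIRED_SLOTS = (
    "file_name", "file_creation_date", "ingest_code_url", "ingest_code_version",
    "source_infores_id", "source_data_version", "source_access_date",
    "target_name", "target_creation_date", "target_data_url",
    "target_data_version", "target_format", "target_model",
    "target_data_model_version",
)
# Slots shown with cardinality "*" (multivalued).
MULTIVALUED_SLOTS = (
    "file_created_by", "source_access_urls", "source_file_names",
    "node_categories", "edge_predicates",
)
# Slots whose range is Date.
DATE_SLOTS = ("file_creation_date", "source_access_date", "target_creation_date")


def check_ingest_metadata(record: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for slot in REQUIRED_SLOTS:
        if record.get(slot) in (None, ""):
            problems.append(f"missing required slot: {slot}")
    for slot in MULTIVALUED_SLOTS:
        if slot in record and not isinstance(record[slot], list):
            problems.append(f"slot should be a list: {slot}")
    for slot in DATE_SLOTS:
        value = record.get(slot)
        if isinstance(value, str):
            try:
                date.fromisoformat(value)
            except ValueError:
                problems.append(f"not an ISO date: {slot}")
    return problems


# Hypothetical record: every value is a placeholder, not real ingest metadata.
example = {
    "file_name": "2025-08-18 Translator CTD Ingest Metadata",
    "file_created_by": ["example-ingest-tool"],
    "file_creation_date": "2025-08-18",
    "ingest_code_url": "https://example.org/ingest-code/releases/v0.1.0",
    "ingest_code_version": "0.1.0",
    "source_infores_id": "infores:ctd",
    "source_data_version": "2025-08",
    "source_access_date": "2025-08-17",
    "target_name": "2025-08-18 Translator CTD Ingest Graph",
    "target_creation_date": "2025-08-18",
    "target_data_url": "https://example.org/graphs/ctd/",
    "target_data_version": "1.0",
    "target_format": "KGX-jsonlines",
    "target_model": "Biolink Model",
    "target_data_model_version": "4.2.6-rc5",
}

print(check_ingest_metadata(example))
```

A full validation would normally be done against the LinkML schema itself; this sketch only mirrors the cardinality column and does not check `Uriorcurie` or `Integer` ranges.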
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | biolink:IngestMetadataFile |
native | biolink:IngestMetadataFile |
LinkML Source
Direct
name: IngestMetadataFile
description: Information about a particular ingest performed to produce a KGX graph,
  describing source data, ingest process, and details of the resulting KGX graph.
from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
attributes:
  file_name:
    name: file_name
    description: 'An informative, human-readable name for this metadata file/object.
      e.g. "2025-08-18 Translator CTD Ingest Metadata"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  file_created_by:
    name: file_created_by
    description: 'The agent(s) (person if hand-authored, software tool if created
      programmatically) that created this ingest metadata file.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  file_creation_date:
    name: file_creation_date
    description: When this metadata file was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  ingest_code_url:
    name: ingest_code_url
    description: 'URL of the specific official release or tagged branch of the ingest
      code executed to perform the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  ingest_code_version:
    name: ingest_code_version
    description: 'The version of the ingest code executed to perform the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_infores_id:
    name: source_infores_id
    description: Infores identifier of the ingested source.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  source_data_version:
    name: source_data_version
    description: 'Version of the source data ingested - using source''s own conventions,
      or make one up (e.g. use the date) if not provided by source.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_access_date:
    name: source_access_date
    description: 'Date the source data was accessed/downloaded into the system that
      performed the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  source_access_urls:
    name: source_access_urls
    description: 'URLs where source data was accessed / downloaded / queried to be
      brought into the system performing the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    multivalued: true
  source_file_names:
    name: source_file_names
    description: 'File names from which content used to produce the output KGX graph
      was retrieved.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  target_name:
    name: target_name
    description: 'A unique human readable name of the target data set/graph produced
      by this ingest. e.g. "2025-08-18 Translator CTD Ingest Graph"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_creation_date:
    name: target_creation_date
    description: Date the target data set/graph was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  target_data_url:
    name: target_data_url
    description: URL where the dataset/graph can be retrieved
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  target_data_version:
    name: target_data_version
    description: version of the target dataset/graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_format:
    name: target_format
    description: 'Format in which the KGX graph is serialized (e.g., "KGX-jsonlines").
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model:
    name: target_model
    description: 'Name/identifier of the data model used to structure the KGX graph
      data.
      '
    comments:
    - This will be Biolink Model for all Translator graphs - which can be referenced
      as "https://w3id.org/biolink/biolink-model"
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model_url:
    name: target_model_url
    description: 'A URL providing information about the data model used to structure
      the KGX graph data. e.g. "https://biolink.github.io/biolink-model/"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  target_data_model_version:
    name: target_data_model_version
    description: 'Version of the target data model used to structure the KGX graph
      data. e.g. "4.2.6-rc5"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  node_normalizer:
    name: node_normalizer
    description: 'Name/identifier of the algorithm/tool used to perform node normalization
      on the ingested data. e.g. "Babel"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_version:
    name: node_normalizer_version
    description: Version of the node normalization tool/algorithm.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_url:
    name: node_normalizer_url
    description: 'URL(s) pointing to source code and/or information about the node
      normalization tool used. e.g. "https://github.com/TranslatorSRI/NodeNormalization"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  total_edge_count:
    name: total_edge_count
    description: Count of total edges in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: integer
  total_node_count:
    name: total_node_count
    description: Count of total nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: integer
  orphan_node_count:
    name: orphan_node_count
    description: 'Count of nodes in the graph that do not participate in an edge.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: integer
  node_categories:
    name: node_categories
    description: List of all Biolink categories used for nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  edge_predicates:
    name: edge_predicates
    description: List of all Biolink predicates used in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
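The five graph-summary slots defined above (total_edge_count, total_node_count, orphan_node_count, node_categories, edge_predicates) can all be derived in a single pass over the graph. The sketch below is one possible way to do that, assuming nodes and edges are already loaded as Python dicts with KGX-style keys (`id` and `category` on nodes; `subject`, `predicate`, `object` on edges). The function name `summarize_graph` and the `EX:` identifiers are hypothetical.

```python
def summarize_graph(nodes: list[dict], edges: list[dict]) -> dict:
    """Compute the count and list slots of an ingest metadata record."""
    connected = set()   # ids of nodes that appear in at least one edge
    predicates = set()
    for edge in edges:
        connected.add(edge["subject"])
        connected.add(edge["object"])
        predicates.add(edge["predicate"])

    categories = set()
    orphans = 0
    for node in nodes:
        categories.update(node.get("category", []))
        # An orphan is a node that does not participate in any edge.
        if node["id"] not in connected:
            orphans += 1

    return {
        "total_edge_count": len(edges),
        "total_node_count": len(nodes),
        "orphan_node_count": orphans,
        "node_categories": sorted(categories),
        "edge_predicates": sorted(predicates),
    }


# Toy graph with made-up identifiers: three nodes, one edge, one orphan.
nodes = [
    {"id": "EX:1", "category": ["biolink:ChemicalEntity"]},
    {"id": "EX:2", "category": ["biolink:Disease"]},
    {"id": "EX:3", "category": ["biolink:Gene"]},
]
edges = [
    {"subject": "EX:1", "predicate": "biolink:related_to", "object": "EX:2"},
]

summary = summarize_graph(nodes, edges)
print(summary)
```

Sorting the category and predicate lists is a design choice made here so that repeated runs produce identical metadata; the schema itself does not state an ordering requirement.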
Induced
name: IngestMetadataFile
description: Information about a particular ingest performed to produce a KGX graph,
  describing source data, ingest process, and details of the resulting KGX graph.
from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
attributes:
  file_name:
    name: file_name
    description: 'An informative, human-readable name for this metadata file/object.
      e.g. "2025-08-18 Translator CTD Ingest Metadata"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: file_name
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  file_created_by:
    name: file_created_by
    description: 'The agent(s) (person if hand-authored, software tool if created
      programmatically) that created this ingest metadata file.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: file_created_by
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  file_creation_date:
    name: file_creation_date
    description: When this metadata file was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: file_creation_date
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  ingest_code_url:
    name: ingest_code_url
    description: 'URL of the specific official release or tagged branch of the ingest
      code executed to perform the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: ingest_code_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  ingest_code_version:
    name: ingest_code_version
    description: 'The version of the ingest code executed to perform the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: ingest_code_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_infores_id:
    name: source_infores_id
    description: Infores identifier of the ingested source.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_infores_id
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  source_data_version:
    name: source_data_version
    description: 'Version of the source data ingested - using source''s own conventions,
      or make one up (e.g. use the date) if not provided by source.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_data_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_access_date:
    name: source_access_date
    description: 'Date the source data was accessed/downloaded into the system that
      performed the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_access_date
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  source_access_urls:
    name: source_access_urls
    description: 'URLs where source data was accessed / downloaded / queried to be
      brought into the system performing the ingest.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_access_urls
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    multivalued: true
  source_file_names:
    name: source_file_names
    description: 'File names from which content used to produce the output KGX graph
      was retrieved.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_file_names
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  target_name:
    name: target_name
    description: 'A unique human readable name of the target data set/graph produced
      by this ingest. e.g. "2025-08-18 Translator CTD Ingest Graph"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_name
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_creation_date:
    name: target_creation_date
    description: Date the target data set/graph was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_creation_date
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  target_data_url:
    name: target_data_url
    description: URL where the dataset/graph can be retrieved
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_data_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  target_data_version:
    name: target_data_version
    description: version of the target dataset/graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_data_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_format:
    name: target_format
    description: 'Format in which the KGX graph is serialized (e.g., "KGX-jsonlines").
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_format
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model:
    name: target_model
    description: 'Name/identifier of the data model used to structure the KGX graph
      data.
      '
    comments:
    - This will be Biolink Model for all Translator graphs - which can be referenced
      as "https://w3id.org/biolink/biolink-model"
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_model
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model_url:
    name: target_model_url
    description: 'A URL providing information about the data model used to structure
      the KGX graph data. e.g. "https://biolink.github.io/biolink-model/"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_model_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  target_data_model_version:
    name: target_data_model_version
    description: 'Version of the target data model used to structure the KGX graph
      data. e.g. "4.2.6-rc5"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_data_model_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  node_normalizer:
    name: node_normalizer
    description: 'Name/identifier of the algorithm/tool used to perform node normalization
      on the ingested data. e.g. "Babel"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_normalizer
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_version:
    name: node_normalizer_version
    description: Version of the node normalization tool/algorithm.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_normalizer_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_url:
    name: node_normalizer_url
    description: 'URL(s) pointing to source code and/or information about the node
      normalization tool used. e.g. "https://github.com/TranslatorSRI/NodeNormalization"
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_normalizer_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  total_edge_count:
    name: total_edge_count
    description: Count of total edges in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: total_edge_count
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: integer
  total_node_count:
    name: total_node_count
    description: Count of total nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: total_node_count
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: integer
  orphan_node_count:
    name: orphan_node_count
    description: 'Count of nodes in the graph that do not participate in an edge.
      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: orphan_node_count
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: integer
  node_categories:
    name: node_categories
    description: List of all Biolink categories used for nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_categories
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  edge_predicates:
    name: edge_predicates
    description: List of all Biolink predicates used in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: edge_predicates
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
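The "Cardinality and Range" column in the Slots table appears to follow from the `required`, `multivalued`, and `range` keys of each attribute in the LinkML source. The sketch below illustrates that mapping under stated assumptions: the helper names `cardinality` and `table_cell` are hypothetical, the three slot dicts are hand-copied subsets of the attributes above, and the `1..*` case is an assumption, since no slot in this class is both required and multivalued.

```python
def cardinality(slot: dict) -> str:
    """Map LinkML required/multivalued flags to a cardinality string."""
    required = slot.get("required", False)
    multivalued = slot.get("multivalued", False)
    if multivalued:
        # "1..*" is assumed; this class has no required multivalued slot.
        return "1..*" if required else "*"
    return "1" if required else "0..1"


def table_cell(slot: dict) -> str:
    """Render the cardinality and capitalized range, e.g. '1 String'."""
    return f"{cardinality(slot)} {slot['range'].capitalize()}"


# Subsets of three attributes from the LinkML source, copied by hand.
slots = {
    "file_name": {"range": "string", "required": True},
    "source_access_urls": {"range": "uriorcurie", "multivalued": True},
    "total_edge_count": {"range": "integer"},
}

for name, slot in slots.items():
    print(name, "|", table_cell(slot))
```

Under these assumptions the three slots render as `1 String`, `* Uriorcurie`, and `0..1 Integer`, matching the corresponding rows of the Slots table.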