Class: IngestMetadataFile

Information about a particular ingest performed to produce a KGX graph, describing source data, ingest process, and details of the resulting KGX graph.

URI: biolink:IngestMetadataFile

classDiagram
  class IngestMetadataFile
  click IngestMetadataFile href "../IngestMetadataFile/"
  IngestMetadataFile : edge_predicates
  IngestMetadataFile : file_created_by
  IngestMetadataFile : file_creation_date
  IngestMetadataFile : file_name
  IngestMetadataFile : ingest_code_url
  IngestMetadataFile : ingest_code_version
  IngestMetadataFile : node_categories
  IngestMetadataFile : node_normalizer
  IngestMetadataFile : node_normalizer_url
  IngestMetadataFile : node_normalizer_version
  IngestMetadataFile : orphan_node_count
  IngestMetadataFile : source_access_date
  IngestMetadataFile : source_access_urls
  IngestMetadataFile : source_data_version
  IngestMetadataFile : source_file_names
  IngestMetadataFile : source_infores_id
  IngestMetadataFile : target_creation_date
  IngestMetadataFile : target_data_model_version
  IngestMetadataFile : target_data_url
  IngestMetadataFile : target_data_version
  IngestMetadataFile : target_format
  IngestMetadataFile : target_model
  IngestMetadataFile : target_model_url
  IngestMetadataFile : target_name
  IngestMetadataFile : total_edge_count
  IngestMetadataFile : total_node_count

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| file_name | 1 String | An informative, human-readable name for this metadata file/object | direct |
| file_created_by | * String | The agent(s) (person if hand-authored, software tool if created programmatica... | direct |
| file_creation_date | 1 Date | When this metadata file was created | direct |
| ingest_code_url | 1 Uriorcurie | URL of the specific official release or tagged branch of the ingest code exec... | direct |
| ingest_code_version | 1 String | The version of the ingest code executed to perform the ingest | direct |
| source_infores_id | 1 Uriorcurie | Infores identifier of the ingested source | direct |
| source_data_version | 1 String | Version of the source data ingested - using source's own conventions, or make... | direct |
| source_access_date | 1 Date | Date the source data was accessed/downloaded into the system that performed t... | direct |
| source_access_urls | * Uriorcurie | URLs where source data was accessed / downloaded / queried to be brought into... | direct |
| source_file_names | * String | File names from which content used to produce the output KGX graph was retrie... | direct |
| target_name | 1 String | A unique human readable name of the target data set/graph produced by this in... | direct |
| target_creation_date | 1 Date | Date the target data set/graph was created | direct |
| target_data_url | 1 Uriorcurie | URL where the dataset/graph can be retrieved | direct |
| target_data_version | 1 String | Version of the target dataset/graph | direct |
| target_format | 1 String | Format in which the KGX graph is serialized (e... | direct |
| target_model | 1 String | Name/identifier of the data model used to structure the KGX graph data | direct |
| target_model_url | 0..1 String | A URL providing information about the data model used to structure the KGX gr... | direct |
| target_data_model_version | 1 String | Version of the target data model used to structure the KGX graph data | direct |
| node_normalizer | 0..1 String | Name/identifier of the algorithm/tool used to perform node normalization on t... | direct |
| node_normalizer_version | 0..1 String | Version of the node normalization tool/algorithm | direct |
| node_normalizer_url | 0..1 String | URL(s) pointing to source code and/or information about the node normalizatio... | direct |
| total_edge_count | 0..1 Integer | Count of total edges in the graph | direct |
| total_node_count | 0..1 Integer | Count of total nodes in the graph | direct |
| orphan_node_count | 0..1 Integer | Count of nodes in the graph that do not participate in an edge | direct |
| node_categories | * String | List of all Biolink categories used for nodes in the graph | direct |
| edge_predicates | * String | List of all Biolink predicates used in the graph | direct |
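
The cardinalities and ranges in the table above can be checked mechanically before a metadata file is published. The following is a minimal sketch, not the schema's official validator: the function name `check_ingest_metadata`, the dictionary-based record, and every value in the example record (including the `example.org` URLs and the `infores:ctd` identifier) are assumptions made for illustration. It only checks that required slots (cardinality 1) are present, that multivalued slots (`*`) are lists, that Date slots given as strings parse as ISO 8601 dates, and that Integer slots hold integers.

```python
from datetime import date

# Slot names grouped by the cardinality and range shown in the Slots table.
REQUIRED_SLOTS = [
    "file_name", "file_creation_date", "ingest_code_url", "ingest_code_version",
    "source_infores_id", "source_data_version", "source_access_date",
    "target_name", "target_creation_date", "target_data_url",
    "target_data_version", "target_format", "target_model",
    "target_data_model_version",
]
MULTIVALUED_SLOTS = [
    "file_created_by", "source_access_urls", "source_file_names",
    "node_categories", "edge_predicates",
]
DATE_SLOTS = ["file_creation_date", "source_access_date", "target_creation_date"]
INTEGER_SLOTS = ["total_edge_count", "total_node_count", "orphan_node_count"]


def check_ingest_metadata(record: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for slot in REQUIRED_SLOTS:
        if record.get(slot) in (None, ""):
            problems.append(f"missing required slot: {slot}")
    for slot in MULTIVALUED_SLOTS:
        if slot in record and not isinstance(record[slot], list):
            problems.append(f"{slot} should be a list")
    for slot in DATE_SLOTS:
        value = record.get(slot)
        if isinstance(value, str):
            try:
                date.fromisoformat(value)
            except ValueError:
                problems.append(f"{slot} is not an ISO 8601 date: {value!r}")
    for slot in INTEGER_SLOTS:
        value = record.get(slot)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{slot} should be an integer")
    return problems


# Hypothetical example record; all values are illustrative only.
record = {
    "file_name": "2025-08-18 Translator CTD Ingest Metadata",
    "file_created_by": ["example-ingest-tool"],
    "file_creation_date": "2025-08-18",
    "ingest_code_url": "https://example.org/ingest-code/releases/v1.0.0",
    "ingest_code_version": "1.0.0",
    "source_infores_id": "infores:ctd",
    "source_data_version": "2025-08-01",
    "source_access_date": "2025-08-17",
    "target_name": "2025-08-18 Translator CTD Ingest Graph",
    "target_creation_date": "2025-08-18",
    "target_data_url": "https://example.org/graphs/ctd/2025-08-18",
    "target_data_version": "1.0",
    "target_format": "KGX-jsonlines",
    "target_model": "https://w3id.org/biolink/biolink-model",
    "target_data_model_version": "4.2.6-rc5",
    "total_node_count": 3,
}

print(check_ingest_metadata(record))
```

A schema-aware validator generated from the LinkML source would cover more (for example, the Uriorcurie range); this sketch is only meant to make the cardinality column concrete.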

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/biolink/kgx/ingest-metadata-schema

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | biolink:IngestMetadataFile |
| native | biolink:IngestMetadataFile |

LinkML Source

Direct

name: IngestMetadataFile
description: Information about a particular ingest performed to produce a KGX graph,
  describing source data, ingest process, and details of the resulting KGX graph.
from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
attributes:
  file_name:
    name: file_name
    description: 'An informative, human-readable name for this metadata file/object.
      e.g. "2025-08-18 Translator CTD Ingest Metadata"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  file_created_by:
    name: file_created_by
    description: 'The agent(s) (person if hand-authored, software tool if created
      programmatically) that created this ingest metadata file.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  file_creation_date:
    name: file_creation_date
    description: When this metadata file was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  ingest_code_url:
    name: ingest_code_url
    description: 'URL of the specific official release or tagged branch of the ingest
      code executed to perform the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  ingest_code_version:
    name: ingest_code_version
    description: 'The version of the ingest code executed to perform the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_infores_id:
    name: source_infores_id
    description: Infores identifier of the ingested source.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  source_data_version:
    name: source_data_version
    description: 'Version of the source data ingested - using source''s own conventions,
      or make one up (e.g. use the date) if not provided by source.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_access_date:
    name: source_access_date
    description: 'Date the source data was accessed/downloaded into the system that
      performed the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  source_access_urls:
    name: source_access_urls
    description: 'URLs where source data was accessed / downloaded / queried to be
      brought into the system performing the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    multivalued: true
  source_file_names:
    name: source_file_names
    description: 'File names from which content used to produce the output KGX graph
      was retrieved.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  target_name:
    name: target_name
    description: 'A unique human readable name of the target data set/graph produced
      by this ingest. e.g. "2025-08-18 Translator CTD Ingest Graph"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_creation_date:
    name: target_creation_date
    description: Date the target data set/graph was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  target_data_url:
    name: target_data_url
    description: URL where the dataset/graph can be retrieved.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  target_data_version:
    name: target_data_version
    description: Version of the target dataset/graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_format:
    name: target_format
    description: 'Format in which the KGX graph is serialized (e.g., "KGX-jsonlines").

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model:
    name: target_model
    description: 'Name/identifier of the data model used to structure the KGX graph
      data.

      '
    comments:
    - This will be Biolink Model for all Translator graphs - which can be referenced
      as "https://w3id.org/biolink/biolink-model"
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model_url:
    name: target_model_url
    description: 'A URL providing information about the data model used to structure
      the KGX graph data. e.g. "https://biolink.github.io/biolink-model/"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  target_data_model_version:
    name: target_data_model_version
    description: 'Version of the target data model used to structure the KGX graph
      data. e.g. "4.2.6-rc5"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  node_normalizer:
    name: node_normalizer
    description: 'Name/identifier of the algorithm/tool used to perform node normalization
      on the ingested data. e.g. "Babel"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_version:
    name: node_normalizer_version
    description: Version of the node normalization tool/algorithm.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_url:
    name: node_normalizer_url
    description: 'URL(s) pointing to source code and/or information about the node
      normalization tool used. e.g. "https://github.com/TranslatorSRI/NodeNormalization"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
  total_edge_count:
    name: total_edge_count
    description: Count of total edges in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: integer
  total_node_count:
    name: total_node_count
    description: Count of total nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: integer
  orphan_node_count:
    name: orphan_node_count
    description: 'Count of nodes in the graph that do not participate in an edge.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: integer
  node_categories:
    name: node_categories
    description: List of all Biolink categories used for nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  edge_predicates:
    name: edge_predicates
    description: List of all Biolink predicates used in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true

Induced

name: IngestMetadataFile
description: Information about a particular ingest performed to produce a KGX graph,
  describing source data, ingest process, and details of the resulting KGX graph.
from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
attributes:
  file_name:
    name: file_name
    description: 'An informative, human-readable name for this metadata file/object.
      e.g. "2025-08-18 Translator CTD Ingest Metadata"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: file_name
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  file_created_by:
    name: file_created_by
    description: 'The agent(s) (person if hand-authored, software tool if created
      programmatically) that created this ingest metadata file.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: file_created_by
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  file_creation_date:
    name: file_creation_date
    description: When this metadata file was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: file_creation_date
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  ingest_code_url:
    name: ingest_code_url
    description: 'URL of the specific official release or tagged branch of the ingest
      code executed to perform the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: ingest_code_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  ingest_code_version:
    name: ingest_code_version
    description: 'The version of the ingest code executed to perform the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: ingest_code_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_infores_id:
    name: source_infores_id
    description: Infores identifier of the ingested source.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_infores_id
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  source_data_version:
    name: source_data_version
    description: 'Version of the source data ingested - using source''s own conventions,
      or make one up (e.g. use the date) if not provided by source.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_data_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  source_access_date:
    name: source_access_date
    description: 'Date the source data was accessed/downloaded into the system that
      performed the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_access_date
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  source_access_urls:
    name: source_access_urls
    description: 'URLs where source data was accessed / downloaded / queried to be
      brought into the system performing the ingest.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_access_urls
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    multivalued: true
  source_file_names:
    name: source_file_names
    description: 'File names from which content used to produce the output KGX graph
      was retrieved.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: source_file_names
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  target_name:
    name: target_name
    description: 'A unique human readable name of the target data set/graph produced
      by this ingest. e.g. "2025-08-18 Translator CTD Ingest Graph"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_name
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_creation_date:
    name: target_creation_date
    description: Date the target data set/graph was created.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_creation_date
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: date
    required: true
  target_data_url:
    name: target_data_url
    description: URL where the dataset/graph can be retrieved.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_data_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: uriorcurie
    required: true
  target_data_version:
    name: target_data_version
    description: Version of the target dataset/graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_data_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_format:
    name: target_format
    description: 'Format in which the KGX graph is serialized (e.g., "KGX-jsonlines").

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_format
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model:
    name: target_model
    description: 'Name/identifier of the data model used to structure the KGX graph
      data.

      '
    comments:
    - This will be Biolink Model for all Translator graphs - which can be referenced
      as "https://w3id.org/biolink/biolink-model"
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_model
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  target_model_url:
    name: target_model_url
    description: 'A URL providing information about the data model used to structure
      the KGX graph data. e.g. "https://biolink.github.io/biolink-model/"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_model_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  target_data_model_version:
    name: target_data_model_version
    description: 'Version of the target data model used to structure the KGX graph
      data. e.g. "4.2.6-rc5"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: target_data_model_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    required: true
  node_normalizer:
    name: node_normalizer
    description: 'Name/identifier of the algorithm/tool used to perform node normalization
      on the ingested data. e.g. "Babel"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_normalizer
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_version:
    name: node_normalizer_version
    description: Version of the node normalization tool/algorithm.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_normalizer_version
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  node_normalizer_url:
    name: node_normalizer_url
    description: 'URL(s) pointing to source code and/or information about the node
      normalization tool used. e.g. "https://github.com/TranslatorSRI/NodeNormalization"

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_normalizer_url
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
  total_edge_count:
    name: total_edge_count
    description: Count of total edges in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: total_edge_count
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: integer
  total_node_count:
    name: total_node_count
    description: Count of total nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: total_node_count
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: integer
  orphan_node_count:
    name: orphan_node_count
    description: 'Count of nodes in the graph that do not participate in an edge.

      '
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: orphan_node_count
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: integer
  node_categories:
    name: node_categories
    description: List of all Biolink categories used for nodes in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: node_categories
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
  edge_predicates:
    name: edge_predicates
    description: List of all Biolink predicates used in the graph.
    from_schema: https://w3id.org/biolink/kgx/ingest-metadata-schema
    rank: 1000
    alias: edge_predicates
    owner: IngestMetadataFile
    domain_of:
    - IngestMetadataFile
    range: string
    multivalued: true
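
The graph summary slots (`total_node_count`, `total_edge_count`, `orphan_node_count`, `node_categories`, `edge_predicates`) can be derived from the graph itself rather than entered by hand. The sketch below shows one way this might be done. It assumes nodes and edges are already loaded as Python dictionaries, with an `id` and `category` on each node and a `subject`, `predicate`, and `object` on each edge; the function name `summarize_graph` is hypothetical and is not part of KGX or this schema.

```python
def summarize_graph(nodes: list[dict], edges: list[dict]) -> dict:
    """Compute the optional summary slots from in-memory nodes and edges."""
    node_ids = {node["id"] for node in nodes}

    # Collect every node identifier that appears at either end of an edge.
    connected = set()
    predicates = set()
    for edge in edges:
        connected.add(edge["subject"])
        connected.add(edge["object"])
        predicates.add(edge["predicate"])

    # A node's category may be a single string or a list of strings.
    categories = set()
    for node in nodes:
        category = node.get("category", [])
        if isinstance(category, str):
            category = [category]
        categories.update(category)

    return {
        "total_node_count": len(node_ids),
        "total_edge_count": len(edges),
        # Orphans are nodes that do not participate in any edge.
        "orphan_node_count": len(node_ids - connected),
        "node_categories": sorted(categories),
        "edge_predicates": sorted(predicates),
    }


# Tiny illustrative graph: node "C" has no edges, so it counts as an orphan.
nodes = [
    {"id": "A", "category": ["biolink:Gene"]},
    {"id": "B", "category": ["biolink:Disease"]},
    {"id": "C", "category": "biolink:Gene"},
]
edges = [{"subject": "A", "predicate": "biolink:related_to", "object": "B"}]

summary = summarize_graph(nodes, edges)
print(summary)
```

Sorting the category and predicate lists keeps the output stable between runs, which makes successive metadata files easier to compare.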