additional_notes |
None |
--- |
--- |
agent_type |
The agent type (or types) relevant to this type of edge. Multivalued only if instances of this type of edge can have different agent types in the data. |
--- |
--- |
artifacts |
Links to and descriptions of external artifacts related to the provenance of the ingest, such as Github issues, surveys of prior ingests of the source, etc. |
--- |
--- |
category |
The category of content described by a given consideration, based on how the content will be represented in the target graph (e.g. "Edges", "Node Properties", "Edge Properties"). |
--- |
--- |
citations |
One or more citations to publications describing the source. May be an identifier (e.g. a PMID or DOI), a URL to the published document, or a free-text citation. |
--- |
--- |
consideration |
A description of what additional content should be considered and why. |
--- |
--- |
contributions |
The name of a person making a contribution, and the type of contribution made. e.g. "code author", "code support", "data modeling", "domain expertise". |
--- |
--- |
data_access_locations |
Where the source data that is being ingested can be accessed. Provide one or more URLs, along with optional descriptions of what each URL provides. |
--- |
--- |
data_formats |
The format(s) in which the data is serialized for retrieval and use. |
--- |
--- |
data_provision_mechanisms |
How the source distributes their data (file download, API endpoints, database dump). |
--- |
--- |
data_versioning_and_releases |
A description of how releases are versioned and managed by the source (e.g. general approach, frequency, other important considerations). May also include links to web pages describing such information. |
--- |
--- |
description |
A brief description of the source, including its purpose, scope, and any relevant background information. |
--- |
--- |
edge_properties |
A list of one or more Biolink edge properties used in instances of this edge type in the data. |
--- |
--- |
edge_type_info |
A description of each type of edge (metaedge) created in the target knowledge graph by this ingest, including what types of edge properties it holds, and a brief explanation of why this modeling pattern was deemed appropriate to represent the source data (to be displayed for end users in a UI system). |
--- |
--- |
fields_used |
Optional list of the specific source fields that are part of or inform the ingest. |
--- |
--- |
file_name |
The name of the relevant file (or endpoint, or table). |
--- |
--- |
filtered_content |
A description of what types of records from each relevant file are not included in the ingest, and the rationale for any filtering rules or exclusion criteria. Only list a file if some but not all records it contains are included in the ingest - to document what subset was excluded, and why. |
--- |
--- |
filtered_records |
A description of what types of records were excluded from the ingest, in terms of filtering rules or exclusion criteria. |
--- |
--- |
future_considerations |
Notes about content additions or changes to consider in future iterations of this ingest. Separately consider content that will be represented as Edges vs Node Properties vs Edge Properties in the target knowledge graph. |
--- |
--- |
included_content |
A description of what types of records from relevant files/endpoints/tables above are included in this ingest, and optionally a list of fields from these records that are part of the ingest or used to inform it. |
--- |
--- |
included_records |
A description of the types of records that are included in the ingest. |
--- |
--- |
infores_id |
The infores identifier of the source from which content is being ingested, e.g. "infores:ctd". |
--- |
--- |
ingest_categories |
A term or terms indicating the type of source being ingested, from the perspective of the ingesting system (e.g. primary knowledge provider, supporting data provider, ontology/terminology provider). |
--- |
--- |
ingest_info |
Information about the rationale and scope of an ingest, including what source content was included and excluded from the ingest, and what additional content might be considered in future iterations. |
--- |
--- |
knowledge_level |
The knowledge level (or levels) relevant to this type of edge. Multivalued only if instances of this type of edge can have different knowledge levels in the data. |
--- |
--- |
license_name |
The name of an established license used by the source (e.g. "CC BY 4.0") |
--- |
--- |
license_url |
The url of an established license (e.g. "https://creativecommons.org/licenses/by/4.0/") |
--- |
--- |
location |
The URL of a web page or ftp site where the indicated file (or endpoint or table) was accessed. |
--- |
--- |
name |
A human readable name for the RIG. |
--- |
--- |
node_category |
The high-level Biolink category of nodes as assumed or assigned by ingestors. e.g. "biolink:Gene". Note that downstream normalization of node identifiers may result in new/different categories ultimately being assigned in the final graph. |
--- |
--- |
node_properties |
A list of one or more Biolink node properties used in instances of this node type in the data. |
--- |
--- |
node_type_info |
A description of each type of node created in the target knowledge graph by this ingest, in terms of the high-level Biolink categor(ies) of nodes as assumed or assigned by ingestors. Note however that downstream normalization of node identifiers may result in new/different categories ultimately being assigned in the final graph. |
--- |
--- |
object_categories |
The Biolink category of the object node of this edge type, e.g. "biolink:Disease". If two edge types differ only in their object category, but use the same predicate, subject_category, edge properties, and general provenance, they can be described together in a single EdgeType object that captures the alternative object categories. For example, if a source provides Gene-associated_with-Disease and Gene-associated_with-PhenotypicFeature edge types, these can be described in a single EdgeType object with two object categories (Disease and PhenotypicFeature). |
--- |
--- |
predicate |
The Biolink predicate that defines this type of edge (e.g. "biolink:treats"). |
--- |
--- |
property |
The Biolink qualifier slot that defines the kind of Qualifier specified, e.g. "biolink:subject_aspect_qualifier", "qualified_predicate". |
--- |
--- |
provenance_info |
Information about the provenance of the ingest, including who contributed and how, and links to external provenance-related artifacts (e.g. Github tickets, ingest surveys, etc.) |
--- |
--- |
qualifiers |
If relevant, report any qualifiers applied to the edge type, as a Qualifier object that contains a qualifier_property and qualifier_range pair, e.g. the property "biolink:subject_aspect_qualifier" and range "biolink:GeneOrGeneProductOrChemicalEntityAspectEnum". |
--- |
--- |
rationale |
The rationale for excluding the indicated content (why this subset of records was filtered out). |
--- |
--- |
relevant_files |
A description of each source file (or API endpoint, database, or table) that contains data used to create the ingested knowledge. Source files that contain data not used to create knowledge need not be listed or described. |
--- |
--- |
scope |
A short, high-level narrative describing the types of knowledge from the source that are included in and excluded from this ingest. |
--- |
--- |
source_identifier_types |
The type of identifier(s) used for this category of entity by the source system. Report as a prefix for an identifier system where appropriate/possible (preferably a prefix as cataloged in the Biolink prefix map here: https://github.com/biolink/biolink-model/blob/master/project/prefixmap/biolink-model-prefix-map.json). e.g. "MESH", "CTD", "ECTO". If prefix for a public system/database is not in the prefix map, you may make a PR to add it. If the identifiers used are bespoke, or no identifiers are used, the value can be a free text description. e.g. "The source uses entity names but does not assign identifiers". |
--- |
--- |
source_info |
Information about the source from which content is ingested. |
--- |
--- |
subject_categories |
The Biolink category of the subject node of this edge type, e.g. "biolink:SmallMolecule". If two edge types differ only in their subject category, but use the same predicate, object_category, edge properties, and general provenance, they can be described together in a single EdgeType object that captures the alternative subject categories. For example, if a source provides SmallMolecule-treats-Disease and MolecularMixture-treats-Disease edge types, these can be described in a single EdgeType object with two subject categories (SmallMolecule and MolecularMixture). |
--- |
--- |
supporting_data_source_info |
Information about upstream sources of data that are used by an ingested source, to derive the knowledge that we ingest. |
--- |
--- |
target_info |
Information about the dataset / knowledge graph output by the ingest, including what types of edges and nodes were produced, modeling rationale, and what modeling changes might be considered in future iterations. |
--- |
--- |
terms_of_use_description |
A free text description of the terms of use for a source. (e.g. "Source only indicates 'all rights reserved' in their documentation") |
--- |
--- |
terms_of_use_info |
Information about conditions for use of the ingested source. May include the name of a community license (e.g. CC-BY 4.0 ), a link to a "terms of use" or license information web page (e.g. https://ctdbase.org/about/legal.jsp ), and/or a free-text summary of key terms of use. |
--- |
--- |
terms_of_use_url |
The url of a document or web page where a source describes its terms of use, and/or references a community license that it adopts. (e.g. "https://ctdbase.org/about/legal.jsp") |
--- |
--- |
ui_explanation |
A brief explanation of why this modeling pattern was deemed appropriate to represent the source data (for display to end users in a UI system). |
--- |
--- |
utility |
Brief description of why the source was ingested, and the utility of the data it provides for target system use cases. |
--- |
--- |
value_description |
A free text description of the types of values allowed for the qualifier. |
--- |
--- |
value_enumeration |
A set of one or more specific values for the qualifier in an Edge type (e.g. ["biolink:causes"] as the only value for the "biolink:qualified_predicate" qualifier property, ["activity_or_abundance", "activity", "abundance"] as the values for the "biolink:object_aspect_qualifier" property). |
--- |
--- |
value_id_prefixes |
One or more id prefixes from which the qualifier value must come, e.g. "HP" if the qualifier must be a Human Phenotype Ontology term. |
--- |
--- |
value_range |
The Biolink class(es) or type(s) that specify the kind of value the qualifier property takes. Reported as the name of a Biolink class, enumeration, or data type, as appropriate, e.g. "biolink:Disease", "biolink:GeneOrGeneProductOrChemicalEntityAspectEnum", "biolink:string".
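
As an illustration of how the edge-type and qualifier slots above fit together, the following is a minimal Python sketch, not the schema's actual implementation. The `Qualifier` and `EdgeType` dataclasses and the `expand` helper are hypothetical; only the field names and example values are taken from the slot descriptions. It shows one EdgeType object covering two subject categories, and two qualifiers built from the `value_enumeration` examples.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Optional, Tuple


@dataclass
class Qualifier:
    # Hypothetical container; field names mirror the qualifier slots above.
    property: str
    value_range: List[str] = field(default_factory=list)
    value_enumeration: List[str] = field(default_factory=list)
    value_id_prefixes: List[str] = field(default_factory=list)
    value_description: Optional[str] = None


@dataclass
class EdgeType:
    # Hypothetical container; field names mirror the edge-type slots above.
    subject_categories: List[str]
    predicate: str
    object_categories: List[str]
    qualifiers: List[Qualifier] = field(default_factory=list)

    def expand(self) -> List[Tuple[str, str, str]]:
        """List every concrete (subject, predicate, object) pattern covered."""
        return [
            (s, self.predicate, o)
            for s, o in product(self.subject_categories, self.object_categories)
        ]


# Two edge types that differ only in subject category, described together.
treats = EdgeType(
    subject_categories=["biolink:SmallMolecule", "biolink:MolecularMixture"],
    predicate="biolink:treats",
    object_categories=["biolink:Disease"],
)

# Qualifiers built from the value_enumeration examples in the table.
example_qualifiers = [
    Qualifier(
        property="biolink:qualified_predicate",
        value_enumeration=["biolink:causes"],
    ),
    Qualifier(
        property="biolink:object_aspect_qualifier",
        value_enumeration=["activity_or_abundance", "activity", "abundance"],
    ),
]

for pattern in treats.expand():
    print(pattern)
```

Calling `expand` recovers the individual SmallMolecule-treats-Disease and MolecularMixture-treats-Disease patterns, which is the sense in which a single multivalued EdgeType stands in for several edge types.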