HemOncKB CC BY subset (doi:10.7910/DVN/9CY9C6)

Document Description

Citation

Title:

HemOncKB CC BY subset

Identification Number:

doi:10.7910/DVN/9CY9C6

Distributor:

Harvard Dataverse

Date of Distribution:

2021-02-09

Version:

18

Bibliographic Citation:

Warner MD, MS, Jeremy, 2021, "HemOncKB CC BY subset", https://doi.org/10.7910/DVN/9CY9C6, Harvard Dataverse, V18, UNF:6:uOCu1a+KvBr4S9YuwACrgw== [fileUNF]

Study Description

Citation

Title:

HemOncKB CC BY subset

Identification Number:

doi:10.7910/DVN/9CY9C6

Authoring Entity:

Warner MD, MS, Jeremy (HemOnc.org LLC)

Grant Number:

U24 CA265879

Distributor:

Harvard Dataverse

Access Authority:

Warner MD, MS, Jeremy

Depositor:

Warner MD, MS, Jeremy

Date of Deposit:

2021-02-09

Holdings Information:

https://doi.org/10.7910/DVN/9CY9C6

Study Scope

Keywords:

Medicine, Health and Life Sciences, Neoplasms, Controlled vocabulary

Abstract:

This dataset is a subset of the full HemOncKB centered on the Component concept and its relationships and hierarchies. It also includes all Conditions from the HemOncKB. Components are mapped to ATC, HCPCS, NDC, and RxNorm/RxNorm Extension codes, and Conditions are mapped to ICD-9-CM, ICD-10-CM, ICD-O-3 histology, ICD-O-3 morphology, NCIT, OncoTree, and SEER Site recodes. HemOnc follows the OMOP Common Data Model format and specifications.

Methodology and Processing

Sources Statement

Data Access

Citation Requirement:

If you use the HemOncKB in articles or publications that are based on analysis of this data, please cite Warner JL, Dymshyts D, Reich CG, Gurley MJ, Hochheiser H, Moldwin ZH, Belenkaya R, Williams AE, Yang PC. HemOnc: A new standard vocabulary for chemotherapy regimen representation in the OMOP common data model. J Biomed Inform. 2019 Aug;96:103239. Epub 2019 Jun 22. PMID: 31238109; PMCID: PMC6697579. doi: 10.1016/j.jbi.2019.103239

Deposit Requirement:

There are no requirements, but we would appreciate it if you could share a copy of any manuscript that uses the HemOncKB.

Disclaimer:

The HemOncKB is primarily a derivative of the HemOnc.org website. HemOnc.org is a reference to be used for educational purposes only by appropriately trained medical professionals. This site is intended to give providers the ability to quickly refer to the primary literature, review notes from themselves and their peers, and to share useful resources. This site may not be used as a substitute for formal medical training, may never be used as a substitute for independent clinical judgment, and in no way should be construed as offering medical advice.

Notes:

This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Under this license, users are free to:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

Under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Other Study Description Materials

Related Publications

Citation

Title:

HemOnc: A new standard vocabulary for chemotherapy regimen representation in the OMOP common data model

Identification Number:

10.1016/j.jbi.2019.103239

Bibliographic Citation:

Warner JL, Dymshyts D, Reich CG, Gurley MJ, Hochheiser H, Moldwin ZH, Belenkaya R, Williams AE, Yang PC. HemOnc: A new standard vocabulary for chemotherapy regimen representation in the OMOP common data model. J Biomed Inform. 2019 Aug;96:103239. Epub 2019 Jun 22. PMID: 31238109; PMCID: PMC6697579.

File Description--f11680547

File: 2025-06-28.ccby_concepts.tab

  • Number of cases: 6877

  • No. of variables per record: 7

  • Type of File: text/tab-separated-values

Notes:

UNF:6:19+opzBCdbmgaT3ijlsjnQ==

File Description--f11680548

File: 2025-06-28.ccby_rels.tab

  • Number of cases: 32418

  • No. of variables per record: 5

  • Type of File: text/tab-separated-values

Notes:

UNF:6:m8AiRZkB7KlFPRi1Ki/7Gg==

File Description--f11680546

File: 2025-06-28.ccby_synonyms.tab

  • Number of cases: 9149

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:HCN7KIUc0p8xw9MAlpeL8Q==
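
The three files described above are tab-separated, so a standard TSV parser should be able to read them. The following is a minimal Python sketch, assuming each file begins with a header row whose names match the variables listed under Variable Description. That header assumption, the helper name read_tab, and the sample row (including the vocabulary value) are illustrative and are not taken from the dataset itself.

```python
import csv
import io

# Expected column counts, taken from "No. of variables per record" above.
EXPECTED_COLUMNS = {
    "2025-06-28.ccby_concepts.tab": 7,
    "2025-06-28.ccby_rels.tab": 5,
    "2025-06-28.ccby_synonyms.tab": 3,
}


def read_tab(handle, expected_columns):
    """Read a tab-separated stream into a list of dicts (hypothetical helper).

    Assumes the first row is a header. All values are kept as strings so
    that codes are not silently converted to numbers.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or len(reader.fieldnames) != expected_columns:
        raise ValueError(
            f"expected {expected_columns} columns, got {reader.fieldnames}"
        )
    return list(reader)


# Made-up sample standing in for the synonyms file (3 variables per record).
sample = io.StringIO(
    "synonym_name\tsynonym_concept_code\tsynonym_vocabulary_id\n"
    "Example synonym\t1\tHemOnc\n"
)
rows = read_tab(sample, EXPECTED_COLUMNS["2025-06-28.ccby_synonyms.tab"])
print(rows[0]["synonym_name"])
```

To read a downloaded file instead of the in-memory sample, pass a handle opened with open(path, newline="", encoding="utf-8"). The encoding is also an assumption and may need adjusting.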

Variable Description

List of Variables:

Variables

concept_name

f11680547 Location:

Variable Format: character

Notes: UNF:6:7p9x65L84PgQOmCcSNP3bA==

vocabulary_id

f11680547 Location:

Variable Format: character

Notes: UNF:6:hlorq/JKeqBi4XhVFJAqNQ==

concept_class_id

f11680547 Location:

Variable Format: character

Notes: UNF:6:vyNtw6grU4UuPvyl7LJbqg==

concept_code

f11680547 Location:

Summary Statistics: Min. 1.0; Valid 6877.0; Max. 159729.0; Mean 53228.202268429464; StDev 46763.32859169911

Variable Format: numeric

Notes: UNF:6:i0Wo2IUCoMXRonmlzy0R5Q==

valid_start_date

f11680547 Location:

Variable Format: character

Notes: UNF:6:04aIcju3yAi+uaFtF+40sw==

valid_end_date

f11680547 Location:

Variable Format: character

Notes: UNF:6:tfoH+AQCbuWGxQrBHzdKTA==

invalid_reason

f11680547 Location:

Variable Format: character

Notes: UNF:6:9EnzLqypA94DgP7WlK+ymA==

concept_code_1

f11680548 Location:

Summary Statistics: Min. 1.0; Valid 32418.0; Max. 159149.0; Mean 12126.293263004523; StDev 30356.465296943352

Variable Format: numeric

Notes: UNF:6:URIpZzQHQuETdiBYn53rIA==

concept_code_2

f11680548 Location:

Variable Format: character

Notes: UNF:6:JhcYZAMaiBb52cAepX1ZBQ==

vocabulary_id_1

f11680548 Location:

Variable Format: character

Notes: UNF:6:jSePwgg/0Ek6JvLwyzIaIQ==

vocabulary_id_2

f11680548 Location:

Variable Format: character

Notes: UNF:6:d57T1q+I2tUceN248XUTmA==

relationship_id

f11680548 Location:

Variable Format: character

Notes: UNF:6:1n/g3nOCUi05WiXXZ7chlg==

synonym_name

f11680546 Location:

Variable Format: character

Notes: UNF:6:CAdwpaZCaUTDlBi2eqFsdA==

synonym_concept_code

f11680546 Location:

Summary Statistics: Min. 1.0; Valid 9149.0; Max. 159155.0; Mean 46555.08875287212; StDev 48113.973900216275

Variable Format: numeric

Notes: UNF:6:oS1dSaUyDDLCxL62fO8icw==

synonym_vocabulary_id

f11680546 Location:

Variable Format: character

Notes: UNF:6:kL2ZdIi7lOjpbpsSVwcLGw==
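
The variables above suggest how the files fit together: each row of the relationships file appears to link a (concept_code_1, vocabulary_id_1) pair to a (concept_code_2, vocabulary_id_2) pair through a relationship_id. The Python sketch below indexes relationship rows by their source pair under that assumption. The pairing of code and vocabulary as a key is an inference from the variable names, and every row value shown (codes, vocabulary names, relationship names) is invented for illustration. Codes are handled as strings because concept_code_2 is listed as character rather than numeric.

```python
from collections import defaultdict


def index_relationships(rels):
    """Group relationship rows by their source (code, vocabulary) pair.

    `rels` is an iterable of dicts keyed by the five variables of the
    relationships file. Hypothetical helper, not part of the dataset.
    """
    index = defaultdict(list)
    for row in rels:
        source = (row["concept_code_1"], row["vocabulary_id_1"])
        target = (
            row["relationship_id"],
            row["concept_code_2"],
            row["vocabulary_id_2"],
        )
        index[source].append(target)
    return index


# Invented rows; real values must come from the relationships file.
rels = [
    {
        "concept_code_1": "1",
        "vocabulary_id_1": "HemOnc",
        "relationship_id": "example relationship A",
        "concept_code_2": "X00",
        "vocabulary_id_2": "ExampleVocabA",
    },
    {
        "concept_code_1": "1",
        "vocabulary_id_1": "HemOnc",
        "relationship_id": "example relationship B",
        "concept_code_2": "12345",
        "vocabulary_id_2": "ExampleVocabB",
    },
    {
        "concept_code_1": "2",
        "vocabulary_id_1": "HemOnc",
        "relationship_id": "example relationship A",
        "concept_code_2": "Y11",
        "vocabulary_id_2": "ExampleVocabA",
    },
]

index = index_relationships(rels)
for relationship_id, code, vocabulary in index[("1", "HemOnc")]:
    print(relationship_id, code, vocabulary)
```

The same pattern could be applied to the synonyms file by keying on synonym_concept_code and synonym_vocabulary_id, again assuming that pair identifies a concept.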