Every two weeks or so, I want to highlight a medical dataset that is semi-publicly available. I say "semi" because some medical datasets (especially the most useful ones) require a data use agreement. Getting access typically requires a project statement, a signed agreement that you won't do anything nefarious or try to re-identify people in the dataset, and, optionally but recommended, human subjects training through freely available resources like CITI.
These datasets are part of a lecture I give to my students about data sources. Search the tag "medical datasets" to get a list of all posts in this series.
For each dataset, I will highlight the following basic information:
- name of the dataset.
- author.
- short description of purpose.
- number of rows.
- number of features.
- general description of features.
- data format (CSV, SAS, etc.).
- URL for the data.
- URL for the data dictionary.
- one or two links to papers that use the dataset.
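
For anyone who wants to keep these entries in a consistent, machine-readable form, the fields above could be captured as a small record. This is only a sketch: the class name `DatasetEntry`, its field names, and the `to_lines` helper are hypothetical, and the example values are placeholders rather than a real dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetEntry:
    """Basic information for one dataset post (field names are hypothetical)."""
    name: str
    author: str
    purpose: str
    n_rows: int
    n_features: int
    feature_summary: str
    data_format: str
    data_url: str
    dictionary_url: str
    paper_urls: List[str] = field(default_factory=list)

    def to_lines(self) -> List[str]:
        """Render the entry as a dash list, in the same order as the list above."""
        lines = [
            f"- name: {self.name}",
            f"- author: {self.author}",
            f"- purpose: {self.purpose}",
            f"- rows: {self.n_rows}",
            f"- features: {self.n_features}",
            f"- feature summary: {self.feature_summary}",
            f"- format: {self.data_format}",
            f"- data: {self.data_url}",
            f"- data dictionary: {self.dictionary_url}",
        ]
        # One line per linked paper (the template calls for one or two).
        lines.extend(f"- paper: {url}" for url in self.paper_urls)
        return lines


# Placeholder values only; this does not describe a real dataset.
example = DatasetEntry(
    name="Example Dataset",
    author="Example Author",
    purpose="Placeholder description of purpose.",
    n_rows=1000,
    n_features=12,
    feature_summary="demographics, labs, outcomes",
    data_format="csv",
    data_url="https://example.org/data.csv",
    dictionary_url="https://example.org/dictionary.html",
    paper_urls=["https://example.org/paper1"],
)

print("\n".join(example.to_lines()))
```

Keeping the fields typed (integers for counts, a list for papers) makes it easy to spot a post that is missing a piece of the template.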