ICPSR Metadata Schema

Last updated: May 12, 2026

This metadata schema is used to describe data collections at the Inter-university Consortium for Political and Social Research (ICPSR) after 2026.

These rules and definitions document ICPSR's metadata practices and are intended to (a) assist ICPSR staff with metadata entry, and (b) help users – including data depositors and researchers – understand and interpret ICPSR metadata.

Machine-actionable copies of metadata field definitions are also available in JSON Schema format.

Metadata Elements: Overview

Property | Required? | Repeatable? | Accepted Values | Description
Title | Yes | No | Text | The official title that describes what the data collection is about, its geographic scope, and the time period it covered.
Alternate Titles | No | Yes | Text | The alternate name(s) or acronym(s) commonly used to refer to the data collection.
Principal Investigators | Yes | Yes | Multi-part element; see subfields | The key people or organizations responsible for the data collection, listed by importance. Each data collection requires at least one PI, either a person or an organization.
Funding Sources | No | Yes | Multi-part element; see subfields | The sources of funding that supported the data collection.
Summary | Yes | No | Text | A description of the data collection that helps users understand its purpose, substance, and key topics.
ICPSR Subject Terms | Yes | Yes | Multi-part element; see subfields | A controlled list of social science terms maintained by ICPSR and used to indicate topics related to the data collection.
Journal of Economic Literature (JEL) Classification Codes | No | Yes | Multi-part element; see subfields | Classification codes used to categorize economic research.
Medical Subject Headings (MeSH) Terms | No | Yes | Multi-part element; see subfields | Biomedical and health-related terms from the National Library of Medicine that describe the data collection's topics.
Time Periods | Yes | Yes | Multi-part element; see subfields | The time period(s) to which the data refer, regardless of when the data were collected.
Nationally Representative Sample | No | No | Text | Indicates whether the data collection uses a sampling design intended to represent the demographics, behaviors, and/or characteristics of the entire nation. This typically involves probability-based methods that allow generalization. It does not include convenience samples that appear similar to the nation by chance.
Geographic Coverage Areas | Yes | Yes | Multi-part element; see subfields | The geographic locations where the data refer or are related.
Smallest Geographic Unit | No | No | Multi-part element; see subfields | The smallest geographic unit (e.g., state or census tract) used in the dataset.
Study Design | No | No | Text | The procedures used to contact participants and gather data.
Universe | No | No | Text | The total group of persons or other entities (e.g., households or organizations) that were the object of research and to which analytic results refer.
Time Methods | No | Yes | Multi-part element; see subfields | The methods used to collect data over time, like snapshots at one point (cross-sectional) or repeatedly (longitudinal) to study changes or trends.
Units of Analysis | No | Yes | Multi-part element; see subfields | The object(s) of analysis for the data collection, such as an organization, individual, or household.
Sampling Procedures | No | Yes | Text | The type(s) of sample and sample design used to select survey respondents to represent the population.
Sampling Note | No | No | Text | Supplemental information about the sampling process that does not fit neatly into the Sampling Procedure field.
Weights | No | No | Text | The weight variables and the criteria for using them in data analysis, or other information about how the data are weighted if no weight variables are present.
Response Rates | No | No | Text | The percentage of respondents in the sample who participated in the data collection.
Data Source Types | No | Yes | Multi-part element; see subfields | The source(s) of the data as collected by the Principal Investigators.
External Data Sources | No | Yes | Text | The source of the data, when that source is external to the data collection and can be independently cited.
Collection Modes | No | Yes | Multi-part element; see subfields | The method(s) or procedure(s) used to collect the data, such as an interview or experiment.
Collection Dates | No | Yes | Multi-part element; see subfields | The date(s) data collection took place.
Variable Description | No | No | Text | Significant variables (particularly demographic variables) in the data files.
Scales | No | No | Text | Any commonly known scales, measures, or inventories used in the data collection.
Data Management Plan | No | No | Text | A link to the data management plan (preferably a persistent identifier such as a DOI).
Preregistration | No | No | Text | A link to a research plan for the data collection (preferably a persistent identifier such as a DOI).
Software Applications | No | Yes | Multi-part element; see subfields | Software used by the principal investigator(s) to collect or analyze data, required to understand how the data were obtained or to reproduce results.
General Data Formats | No | Yes | Multi-part element; see subfields | The file format types present in the data collection.
Notes | No | Yes | Text | Important details about the data collection (like unique authoring, discrepancies, or processing information) that can't be recorded in other metadata elements.
Manuscript Number | No | No | Text | A unique identifier that associates the data collection with a manuscript submitted to a journal.
ADA Accessibility | No | No | Multi-part element; see subfields | Indicates whether the data collection is ADA accessible, conforming to WCAG 2.1 AA standards, or qualifies for the ADA archival exception.
License | No | No | Multi-part element; see subfields | A license governing the data's use.
Version History | No | Yes | Multi-part element; see subfields | A record of how the data collection has changed over time.
Distributors | No | Yes | Multi-part element; see subfields | The organization(s) responsible for distributing the data collection.
Study Number | Yes | No | Number | A unique, numerical value used by ICPSR to identify and track data collections.
Digital Object Identifier (DOI) | Yes | No | Text | The registered persistent digital object identifier (DOI) associated with the data collection.
Citation | No | No | Text | The official way to reference the data collection in writing.
Person | No | No | Multi-part element; see subfields | A person associated with an ICPSR data collection or service.
Organization | No | No | Multi-part element; see subfields | An organization associated with an ICPSR data collection or service.

Key for ICPSR Metadata Schema Entries

Full information for each ICPSR study metadata element includes the following fields:

  • Description: A short description of the metadata element and the information it is intended to convey.
  • Required: Indicates whether the metadata element is mandatory ("Yes") or optional ("No"). Required elements must include at least one value.
  • Repeatable: Indicates whether the metadata element may be repeated ("Yes") or if it may only occur once ("No").
  • Accepted values: The type of values that may be used with the metadata element; options include text (with additional requirements, such as date formatting, noted when present) and numbers. Multi-part metadata elements have accepted value information provided in entries for individual subelements.
  • Usage Notes: Additional information about the nature, scope, and conventions for values that may be added to the metadata element.
  • Examples: Examples of valid values for the metadata element.
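
The Required and Repeatable rules above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not ICPSR's validation logic: the `ELEMENT_RULES` table holds a handful of elements copied from the overview, and the record layout (a dictionary mapping each element name to a list of values) and the function name `check_record` are hypothetical.

```python
# (required, repeatable) flags for a few elements, copied from the overview table.
# The dictionary layout itself is an assumption made for this sketch.
ELEMENT_RULES = {
    "Title": (True, False),
    "Alternate Titles": (False, True),
    "Summary": (True, False),
    "Time Periods": (True, True),
}


def check_record(record: dict) -> list:
    """Return a list of rule violations for a record.

    `record` maps element names to a list of values (hypothetical layout).
    """
    problems = []
    for name, (required, repeatable) in ELEMENT_RULES.items():
        values = record.get(name, [])
        # Required elements must include at least one value.
        if required and not values:
            problems.append(f"{name}: required element has no value")
        # Non-repeatable elements may only occur once.
        if not repeatable and len(values) > 1:
            problems.append(f"{name}: element may only occur once")
    return problems


record = {
    "Title": ["Aid Like A Paycheck (ALAP), Texas and California, 2014-2017"],
    "Alternate Titles": ["FACES 2009", "Survey of Consumers"],
}
print(check_record(record))
```

Here the record passes the Title and Alternate Titles checks, and the two missing required elements (Summary and Time Periods) are reported.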

Metadata Elements: Detailed Information

Title

Description: The official title that describes what the data collection is about, its geographic scope, and the time period it covered.

Required: Yes

Repeatable: No

Accepted Values: Text

Usage Notes: The Title includes three essential parts: the title proper, the geography, and the time period.

Title Proper:

  • The title proper is a descriptive string that captures what the data collection contains.

  • The title proper uses title case: all major words are capitalized, while minor words are lowercased.

  • For new studies, ICPSR starts with the title proper provided by the data depositor. Most title propers are straightforward about their contents, such as the 'American Community Survey' or the 'Census of Law Enforcement Training Academies.' Some title propers include a more branded description, such as 'Bridge of Faith: Aim4Peace Community-Based Violence Prevention Project' and 'Contexts of Cyberbullying: An Epidemiologic Study using Electronic Detection and Social Network Analysis.'

  • For updated studies, ICPSR uses the existing title in production, making changes as necessary to add new years or additional geographical locations. For studies that are part of an ICPSR series, titles remain consistent with the previous series studies.

Geography:

  • All titles include the data collection's geography. If the geography is already included in the title proper, it is not repeated.

  • Cities are paired with state or province names that are spelled out (e.g., Portland, Oregon), unless the city names are unique or well-known.

  • Studies with more than four geographic locations are typically summarized (for example, '5 countries,' '8 German cities,' or '20 U.S. states') instead of listing all locations. For U.S. locations, 'U.S.' is used rather than 'United States' or 'American'.

  • Descriptors that do not have a distinct geographic area, such as 'communities' or 'regions', are not included in titles.

  • 'Global' may be appropriate for studies where the universe of participants is truly worldwide. Possible examples include online surveys that are not restricted by geography, or studies of organizations, such as NGOs.

  • Brackets around the geography are typically not used. They are used when a study has 'National', 'Federal', 'Congressional', or 'American' in the title. Brackets can also be used if a non-United States study has 'National', or a similar word specific to that country, in the title.

Time Period:

  • All titles include the data collection's time period, which reflects the period that the data cover and should match the Time Periods element. For example, in the 'Uganda Elite Study, 1964-1968', it is assumed that the Ugandans were surveyed about events in 1964-1968, even if the actual data collection might not have taken place until later.

  • If the time period is already included in the title proper, it is not repeated.

  • For most studies, a single year or range of years is acceptable. Years are written as four digits, including when used in a range (e.g., '1999', '2001-2003', or '1999, 2010, 2015').

  • Months are included only for studies in an ICPSR series that has multiple, otherwise identical releases each year. In these cases, months are spelled out (e.g., 'September 2020' instead of '9/2020' or 'Sept. 2020').

Examples:

"Bridge of Faith: Aim4Peace Community-Based Violence Prevention Project, Kansas City, Missouri, 2014-2017"
"Health and Relationships Project, United States, 2014-2015"
"Targeted Interventions to Prevent Chronic Low Back Pain in High Risk Patients: A Multi-Site Pragmatic Randomized Controlled Trial (TARGET Trial), 4 U.S. cities, 2016-2019"
"Aid Like A Paycheck (ALAP), Texas and California, 2014-2017"
"COVID-19 Disruptions Disproportionately Affect Female Academics, Global, 2020"

Alternate Titles

Description: The alternate name(s) or acronym(s) commonly used to refer to the data collection.

Required: No

Repeatable: Yes

Accepted Values: Text

Usage Notes: Alternate Title often takes the form of a shortened (by abbreviation or acronym) version of the official title.

Examples:

"Add Health Parent Study"
"FACES 2009"
"Survey of Consumers"
"Eurobarometer 85.2"

Principal Investigators

Description: The key people or organizations responsible for the data collection, listed by importance. Each data collection requires at least one PI, either a person or an organization.

Required: Yes

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: List individuals and organizations that are chiefly responsible for the study across its entire life cycle or made significant intellectual contributions to the research.

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Person | Conditional | No | Multi-part element; see subfields | Name and other details about the principal investigator, if it is an individual person.
Organization | Conditional | No | Multi-part element; see subfields | Name and other details about the principal investigator, if it is an organization.
Order | Yes | No | Number | The order or rank of importance for the PIs associated with the data collection, typically provided to ICPSR by the lead PI.

Person

Description: Name and other details about the principal investigator, if it is an individual person.

Required: Conditional (must include either Person or Organization)

Repeatable: No

Accepted Values: Multi-part element; for more information, see the Person field

Usage Notes: When entering the name of a principal investigator who is a person:

  • Enter a specific and unique name, for example, by including full names and middle initials where appropriate.
  • Follow commonly accepted, language-appropriate practices for capitalization and punctuation.
  • Within the bounds of these first two principles, follow the PI’s precedent for how their name appears in print.

To determine the preferred form of name to appear in ICPSR’s metadata catalog, consult the following authority sources, in this order:

  • ICPSR's metadata catalog. If the PI has published data with ICPSR before, especially curated data, use the name as it appears in previous studies.
  • The PI’s curriculum vitae published on an institutional website.
  • The Virtual International Authority File (VIAF).
  • The PI’s Open Researcher and Contributor Identifier (ORCID) record.
  • The PI’s Google Scholar profile.
  • The PI’s other published works.
  • The PI’s bio on their organization’s website.

The given (i.e., 'first') name may include the middle name or initial. If the person uses only initials for the given name, do not include a space between the first and middle initials (e.g., 'E.V.'). The family (i.e., 'last') name can include any suffixes (such as 'II' or 'Jr.'). Abbreviations are discouraged (especially 'et al.').

Whenever possible, add an ORCID for each principal investigator.

When entering a principal investigator's affiliation(s):

  • Enter the PI's affiliation as it appears in the Research Organization Registry (ROR).
  • If the organization doesn't have a ROR ID, enter its full name, avoid acronyms, and do not include departments or colleges. Consult the following authority sources to determine the preferred name form.
    • ICPSR’s metadata catalog. If other PIs affiliated with this organization have published data with ICPSR before, especially curated data, use the name as it appears in previous studies.
    • The organization's website.
    • The Virtual International Authority File (VIAF).
  • Enter a PI's affiliation at the time the research was conducted. If the organization's name has changed over time, enter the name that applied at the time the research was conducted.
  • If a PI's affiliation has both English and non-English name forms in ROR or VIAF, select a preferred English language form.
  • If a PI's organizational affiliation is not known, use the term 'Unknown' in the PI Organization element.
  • If multiple PIs (people) are affiliated with the same organization, include the affiliated organization's name for each person.
  • If a PI has multiple affiliations, enter each organization as its own affiliation.

Organization

Description: Name and other details about the principal investigator, if it is an organization.

Required: Conditional (must include either Person or Organization)

Repeatable: No

Accepted Values: Multi-part element; for more information, see the Organization field

Usage Notes: When entering the name of a principal investigator that is an organization:

  • Whenever possible, enter the organization name as it appears in the Research Organization Registry (ROR).
  • If the principal investigator is a department or subunit of an organization that appears in ROR, but does not have its own ROR ID, enter the organization name as it appears in ROR, followed by a period and the name of the department or subunit.
  • If the organization doesn’t have a ROR ID, use its full name and avoid acronyms. Consult the following authority sources to determine the preferred name form.
    • ICPSR’s metadata catalog. If the PI has published data with ICPSR before, especially curated data, use the name as it appears in previous studies.
    • The organization's website.
    • The Virtual International Authority File (VIAF).
  • Except for principal investigators that are departments or subunits of organizations in ROR, do not prepend the organization's name with its institutional hierarchy. For example, enter "National Institute on Aging," not "United States Department of Health and Human Services. National Institutes of Health. National Institute on Aging."
  • If the organization's name has changed over time, enter the name that applied at the time the research was conducted.

When selecting a ROR ID, choose the most specific applicable ROR (for example, Inter-university Consortium for Political and Social Research, not University of Michigan).

Order

Description: The order or rank of importance for the PIs associated with the data collection, typically provided to ICPSR by the lead PI.

Required: Yes

Repeatable: No

Accepted Values: Number

Examples:

"0"
"1"
"2"

Complete Principal Investigators Examples (with Subfields):

- "Person":
    "Name":
      "Given": "Miner P."
      "Family": "Marchbanks III"
  "Order": 0
- "Person":
    "Name":
      "Given": "Robert J."
      "Family": "Shiller"
    "Orcid": "https://orcid.org/0009-0006-2316-6486"
    "Affiliations":
    - "Name": "Yale University"
      "Ror": "https://ror.org/03v76x132"
    - "Name": "MacroMarkets"
  "Order": 0

- "Person":
    "Name":
      "Given": "Claudia"
      "Family": "Goldin"
    "Orcid": "https://orcid.org/0000-0003-3842-1604"
    "Affiliations":
    - "Name": "Harvard University"
      "Ror": "https://ror.org/03vek6s52"
  "Order": 1

- "Organization":
    "Name": "Bureau of Justice Statistics"
    "Ror": "https://ror.org/0006s4z66"
  "Order": 2

Funding Sources

Description: The sources of funding that supported the data collection.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Funding Organization | Yes | No | Multi-part element; see subfields | Name and other details about the organization that provided the funding.
Funding Awards | No | Yes | Multi-part element; see subfields | Identifiers and other details about financial support for the data collection.
Order | Yes | No | Number | Internal ICPSR field used to determine the order of importance for the funders associated with the data collection.

Funding Organization

Description: Name and other details about the organization that provided the funding.

Required: Yes

Repeatable: No

Accepted Values: Multi-part element; for more information, see the Organization field

Usage Notes: When entering the name of a funding organization:

  • Whenever possible, enter the organization’s name as it appears in the Research Organization Registry (ROR).
  • If the funding organization is a department or subunit of an organization that appears in ROR, but does not have its own ROR ID, enter the organization name as it appears in ROR, followed by a period and the name of the department or subunit.
  • If the organization doesn't have a ROR ID, use its full name and avoid acronyms. Consult the following authority sources to determine the preferred name form.
    • ICPSR’s metadata catalog. If the organization has funded data collections with ICPSR before, especially curated data, use the name as it appears in previous studies.
    • The organization's website.
    • The Virtual International Authority File (VIAF).
  • Except for funding organizations that are departments or subunits of organizations in ROR, do not prepend the organization's name with its institutional hierarchy. For example, enter "National Institute on Aging" instead of "United States Department of Health and Human Services. National Institutes of Health. National Institute on Aging".
  • If the organization's name has changed over time, enter the name that applied at the time the research was conducted.

The Principal Investigator's home institution does not need to be listed as a funding agency unless the PI provides a grant number (or other award information) or makes a specific request.

Funding Awards

Description: Identifiers and other details about financial support for the data collection.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: Whenever possible, provide a grant number for the funding award. If one exists, you can also provide a URL, preferably a persistent one like a digital object identifier (DOI).

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Funding Identifier | Yes | No | Text | The unique identifier for the funding (e.g., ABC-0123456).
Funding URL | No | No | Text | A unique identifier (URL), preferably a persistent one like a DOI, linking to a landing page with funding information.

Funding Identifier

Description: The unique identifier for the funding (e.g., ABC-0123456).

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"SES-1835721"
"MDR-8550085"
"40791"
Funding URL

Description: A unique identifier (URL), preferably a persistent one like a DOI, linking to a landing page with funding information.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://doi.org/10.35802/212242"
Order

Description: Internal ICPSR field used to determine the order of importance for the funders associated with the data collection.

Required: Yes

Repeatable: No

Accepted Values: Number

Examples:

"0"
"1"
"2"

Complete Funding Sources Examples (with Subfields):

- "Funding Organization":
    "Name": "Robert Wood Johnson Foundation"
    "Ror": "https://ror.org/02ymmdj85"
  "Funding Awards":
  - "Funding Identifier": "MDR-8550085"
  - "Funding Identifier": "MDR-8550204"
  "Order": 0

- "Funding Organization":
    "Name": "Bureau of Justice Statistics"
    "Ror": "https://ror.org/0006s4z66"
  "Funding Awards":
  - "Funding Identifier": "SES-1835721"
    "Funding URL": "https://doi.org/10.35802/000000"
  "Order": 1
- "Funding Organization":
    "Name": "Acme Foundation"
  "Order": 0

Summary

Description: A description of the data collection that helps users understand its purpose, substance, and key topics.

Required: Yes

Repeatable: No

Accepted Values: Text

Usage Notes: The Summary may include information about the different parts of the data collection not adequately conveyed by the Fileset names or found elsewhere in the metadata. Other important components include a listing of major variables or categories of variables (with examples) as well as an indication of the data collection's unit of analysis (i.e., who or what is being studied: individuals, housing units, courts, criminal acts, etc.). Most often the unit of analysis is the individual; where it is not, it is particularly important to make this clear.

The Summary is written in the third person. It does not address how the data might be used or who might be interested in the data, and it makes no evaluative comments about the worth or usefulness of the data collection. The Summary uses past tense when describing the process of collecting the data and present tense when necessary, such as when describing the data (e.g., 'The MIDUS Refresher collection is split into two datasets.'). Numbers are written as numerals rather than spelled out; if a number is spelled out for emphasis, the numeral follows in parentheses (e.g., 'Two thousand (2,000)').

Examples:

"In 2014, Chicago Public Schools, looking to reduce the possibility of gun violence among school-aged youth, applied for a grant through the National Institute of Justice. CPS was awarded the Comprehensive School Safety Initiative grant and use said grant to establish the 'Connect and Redirect to Respect' program. This program used student social media data to identify and intervene with students thought to be at higher risk for committing violence. At-risk behaviors included brandishing a weapon, instigating conflict online, signaling gang involvement, and threats towards others. Identified at-risk students would be contacted by a member of the CPS Network Safety Team or the Chicago Police Department's Gang School Safety Team, depending on the risk level of the behavior. To evaluate the efficacy of CRR, the University of Chicago Crime Lab compared outcomes for students enrolled in schools that received the program to outcomes for students enrolled in comparison schools, which did not receive the program. 32 schools were selected for the study, with a total of 44,503 students. Demographic variables included age, race, sex, and ethnicity. Misconduct and academic variables included arrest history, in-school suspensions, out-of-school suspensions, GPA, and attendance days."
"The Health and Relationship Project is a study of both spouses in same-sex and different-sex marriages who were legally married and aged 35 to 65 at the time of data collection (2015). There are two parts of this study: a baseline questionnaire and a daily diary questionnaire completed for 10 consecutive days; both components were completed online and spouses were asked to complete the surveys separately. The baseline questionnaire asks participants about a number of topics related to marriage and health, including stress, health status and health behaviors, relationship quality, and how they have approached health problems in the past. The diary questionnaire asks participants a number of questions about the past 24 hours, including daily stress experiences, social interactions, and health behaviors."

ICPSR Subject Terms

Description: A controlled list of social science terms maintained by ICPSR and used to indicate topics related to the data collection.

Required: Yes

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary was taken from the ICPSR Subject Terms Thesaurus. Source: https://www.icpsr.umich.edu/web/ICPSR/thesaurus/10001.

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
ICPSR Subject Term | Yes | No | Text | A human-readable form of the subject term.
ICPSR Subject Term Code | Yes | No | Text | A machine-readable/-actionable form of the subject term.
ICPSR Subject Term URI | Yes | No | Text | The URI for the subject term.

ICPSR Subject Term

Description: A human-readable form of the subject term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"employment"
"marriage"
"recidivism"
ICPSR Subject Term Code

Description: A machine-readable/-actionable form of the subject term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"25220"
"26180"
"26961"
ICPSR Subject Term URI

Description: The URI for the subject term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"https://www.icpsr.umich.edu/web/ICPSR/thesaurus/10001/terms/25220"
"https://www.icpsr.umich.edu/web/ICPSR/thesaurus/10001/terms/26180"

Complete ICPSR Subject Terms Examples (with Subfields):

- "ICPSR Subject Term": "lobbying"
  "ICPSR Subject Term Code": "26131"
  "ICPSR Subject Term URI": "https://www.icpsr.umich.edu/web/ICPSR/thesaurus/10001/terms/26131"

- "ICPSR Subject Term": "age"
  "ICPSR Subject Term Code": "24123"
  "ICPSR Subject Term URI": "https://www.icpsr.umich.edu/web/ICPSR/thesaurus/10001/terms/24123"
- "ICPSR Subject Term": "happiness"
  "ICPSR Subject Term Code": "25624"
  "ICPSR Subject Term URI": "https://www.icpsr.umich.edu/web/ICPSR/thesaurus/10001/terms/25624"

Journal of Economic Literature (JEL) Classification Codes

Description: Classification codes used to categorize economic research.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary was taken from the American Economic Association's JEL Classification Codes. Source: https://www.aeaweb.org/jel/guide/jel.php

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Label | Yes | No | Text | A human-readable form of the term.
Code | Yes | No | Text | A machine-readable/-actionable form of the term.
URI | Yes | No | Text | The URI for the JEL classification code.

Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Relation of Economics to Other Disciplines"
"History of Economic Thought, Methodology, and Heterodox Approaches"
"Economic History: Financial Markets and Institutions: U.S.; Canada: 1913-"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"A12"
"B00"
"N22"
URI

Description: The URI for the JEL classification code.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/jelClassifications/terms/A12"
"/api/v1/vocab-terms/jelClassifications/terms/B00"
"/api/v1/vocab-terms/jelClassifications/terms/N22"

Complete Journal of Economic Literature (JEL) Classification Codes Examples (with Subfields):

- "Label": "Relation of Economics to Other Disciplines"
  "Code": "A12"
  "URI": "/api/v1/vocab-terms/jelClassifications/terms/A12"

- "Label": "History of Economic Thought, Methodology, and Heterodox Approaches"
  "Code": "B00"
  "URI": "/api/v1/vocab-terms/jelClassifications/terms/B00"
- "Label": "Economic History: Financial Markets and Institutions: U.S.; Canada: 1913-"
  "Code": "N22"
  "URI": "/api/v1/vocab-terms/jelClassifications/terms/N22"

Medical Subject Headings (MeSH) Terms

Description: Biomedical and health-related terms from the National Library of Medicine that describe the data collection's topics.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary was taken from the National Library of Medicine's Medical Subject Headings (MeSH). Source: https://www.ncbi.nlm.nih.gov/mesh/

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Label | Yes | No | Text | A human-readable form of the subject term.
Code | Yes | No | Text | A machine-readable/-actionable form of the subject term.
URI | Yes | No | Text | The URI for the subject term as maintained in MeSH.

Label

Description: A human-readable form of the subject term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Anxiety"
"Diabetes Mellitus"
Code

Description: A machine-readable/-actionable form of the subject term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"D001007"
"T011730"
URI

Description: The URI for the subject term as maintained in MeSH.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"http://id.nlm.nih.gov/mesh/D001007"
"http://id.nlm.nih.gov/mesh/T011730"

Complete Medical Subject Headings (MeSH) Terms Examples (with Subfields):

- "Label": "Anxiety"
  "Code": "D001007"
  "URI": "http://id.nlm.nih.gov/mesh/D001007"

- "Label": "Diabetes Mellitus"
  "Code": "T011730"
  "URI": "http://id.nlm.nih.gov/mesh/T011730"

Time Periods

Description: The time period(s) to which the data refer, regardless of when the data were collected.

Required: Yes

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Start Date | Yes | No | Text | The start date for the time period the data refer to, formatted as YYYY, YYYY-MM, or YYYY-MM-DD, with no spaces in date expressions.
End Date | Yes | No | Text | The end date for the time period the data refer to, formatted as YYYY, YYYY-MM, or YYYY-MM-DD, with no spaces in date expressions.
Time Frame | No | No | Text | An optional free-text description of the time period, used for non-numeric dates (e.g., 'Fall 2012') or to add context when multiple dates are present.

Start Date

Description: The start date for the time period the data refer to, formatted as YYYY, YYYY-MM, or YYYY-MM-DD, with no spaces in date expressions.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"2000"
"2019-10"
"2021-03-01"
End Date

Description: The end date for the time period the data refer to, formatted as YYYY, YYYY-MM, or YYYY-MM-DD, with no spaces in date expressions.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"2000"
"2019-10"
"2021-03-01"
Time Frame

Description: An optional free-text description of the time period, used for non-numeric dates (e.g., 'Fall 2012') or to add context when multiple dates are present.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: The textual description ('time frame') is used to add context to the Time Period when multiple time periods exist (e.g., to describe different waves, dataset names, or fiscal year designation) and/or when the date cannot be expressed exclusively through numbers, such as seasons or other units of time where the data producer did not clarify the exact dates they meant.

The textual description should not simply restate the time period in words. For example, if the start and end dates for Time Period are 2020-01, the associated Time Frame should not be 'January 2020'.

Examples:

"Fall 2001"
"Winter Semester 2019"

Complete Time Periods Examples (with Subfields):

- "Start Date": "2018"
  "End Date": "2018"
  "Time Frame": "Summer and Fall 2018"

- "Start Date": "2020-10"
  "End Date": "2020-10"

- "Start Date": "2003-01-01"
  "End Date": "2003-12-31"

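The Start Date and End Date formats described above (YYYY, YYYY-MM, or YYYY-MM-DD, with no spaces) can be checked mechanically. The following Python sketch is one possible check, not ICPSR's own validation logic; the function name is_valid_period_date is hypothetical. It also rejects impossible calendar values such as month 13, which goes slightly beyond the format rule itself.

```python
import re
from datetime import date

# YYYY, YYYY-MM, or YYYY-MM-DD, digits and hyphens only (no spaces).
_DATE_RE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def is_valid_period_date(value: str) -> bool:
    """Return True if value is a well-formed YYYY, YYYY-MM, or YYYY-MM-DD date."""
    match = _DATE_RE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # A missing month or day defaults to 1, so only supplied parts are tested.
        # Year 0000 is also rejected because Python's date type does not support it.
        date(int(year), int(month or "1"), int(day or "1"))
    except ValueError:
        return False  # for example, month 13 or 30 February
    return True


for candidate in ("2000", "2019-10", "2021-03-01", "2021-3-1", "Fall 2012"):
    print(candidate, is_valid_period_date(candidate))
```

Values that fail this check, such as seasons, belong in Time Frame rather than in Start Date or End Date.
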
Nationally Representative Sample

Description: Indicates whether the data collection uses a sampling design intended to represent the demographics, behaviors, and/or characteristics of the entire nation. This typically involves probability-based methods that allow generalization. It does not include convenience samples that appear similar to the nation by chance.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Yes"
"No"

Geographic Coverage Areas

Description: The geographic locations to which the data refer or relate.

Required: Yes

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: When choosing Geographic Coverage Areas:

  • Select the country, state, city, county, region, or continent covered by the study.
  • Spell out place names completely instead of using acronyms. For example, enter "United States" instead of "USA."
  • Type at least four characters to see matches.
  • Choose only the narrowest level of geographic coverage. For example, if you select "Los Angeles, California, United States," do not also add "California, United States" and "United States."
  • For studies with participants from around the world or that are applicable everywhere, select "Earth."

Geographic locations are drawn from the GeoNames geographical database. Source: https://www.geonames.org/. Allowable feature codes include:

  • From Feature Class A (country, state, region, ... – e.g., Administrative Divisions):

    • ADM1 (first-order administrative division – e.g., US states, Canadian provinces, etc.)
    • ADM2 (second-order administrative division – e.g., US counties)
    • PCLI (independent political entity – e.g., countries)
    • PCLD (dependent political entity – e.g., Puerto Rico and Guam)
    • PCLF (freely associated state – e.g., Palau, Micronesia, and Marshall Islands)
    • PCLH (historical political entity – e.g., former entities like Yugoslavia and USSR)
    • PCLS (semi-independent political entity – e.g., Palestine, Macao, and Hong Kong)
    • PCL (political entity – e.g., Guernsey, Jersey, and Isle of Man)
    • TERR (territory – e.g., American Samoa, Svalbard and Jan Mayen, etc.)
    • ZN (zone – e.g., European Union, Commonwealth of Nations, and NATO)
  • From Feature Class P (city, village, ... – e.g., Populated Places):

    • PPLG (seat of government of a political entity)
    • PPLC (capital of a political entity)
    • PPLA (seat of a first-order administrative division)
    • PPLA2 (seat of a second-order administrative division)
    • PPL (populated place)
  • From Feature Class L (parks, area, ...):

    • RGN (region)
    • CONT (continent)
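
The allowable feature codes listed above can be expressed as a lookup keyed by feature class. The Python sketch below is illustrative only: the names ALLOWED_FEATURE_CODES and is_allowed_feature are hypothetical, and how the class and code are obtained from a GeoNames record is left to the caller.

```python
# Allowable GeoNames feature codes, grouped by feature class, as listed above.
ALLOWED_FEATURE_CODES = {
    "A": {"ADM1", "ADM2", "PCLI", "PCLD", "PCLF", "PCLH", "PCLS", "PCL", "TERR", "ZN"},
    "P": {"PPLG", "PPLC", "PPLA", "PPLA2", "PPL"},
    "L": {"RGN", "CONT"},
}


def is_allowed_feature(feature_class: str, feature_code: str) -> bool:
    """Return True if the class/code pair appears in the allowable list."""
    return feature_code in ALLOWED_FEATURE_CODES.get(feature_class, set())


print(is_allowed_feature("A", "ADM1"))  # a first-order administrative division
print(is_allowed_feature("H", "LK"))    # class H is not in the allowable list
```

Places that fail such a check would be candidates for the Other Geographic Area subfield.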

Subfields:

Property Required? Repeatable? Accepted Values Description
City No No Text A town, city, or similar populated place covered in the data collection.
County No No Text A United States county or similar administrative area covered in the data collection.
State No No Text A state, province, canton, or similar political entity covered in the data collection.
Country No No Text A country covered in the data collection.
Region No No Text An area distinguished by one or more observable physical or cultural characteristics that is covered in the data collection.
Continent No No Text A continent covered in the data collection.
Other Geographic Area No No Text An area covered in the data collection that cannot be represented using the defined categories above or matched to an appropriate GeoNames record.
URI No No Text A local unique identifier for the geographic coverage area.
External URI No No Text The GeoNames unique identifier for the geographic coverage area.
City

Description: A town, city, or similar populated place covered in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Ann Arbor"
"Hanover"
"Chongqing"
County

Description: A United States county or similar administrative area covered in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Monroe County"
"Washtenaw County"
"Cuyahoga County"
State

Description: A state, province, canton, or similar political entity covered in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Michigan"
"Manitoba"
"Yunnan"
Country

Description: A country covered in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"United States"
"China"
"Ghana"
Region

Description: An area distinguished by one or more observable physical or cultural characteristics that is covered in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Sub-Saharan Africa"
"Eastern Europe"
"Siberia"
Continent

Description: A continent covered in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Africa"
"Asia"
"South America"
Other Geographic Area

Description: An area covered in the data collection that cannot be represented using the defined categories above or matched to an appropriate GeoNames record.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: Use this for user-provided terms, loosely defined geographic concepts, GeoNames feature types not covered by city/county/state/country/region/continent, or historical geographic entities (e.g., Prussia) not represented in GeoNames.

Examples:

"Global"
"Eurasia"
"13 U.S. states in 3 regions"
URI

Description: A local unique identifier for the geographic coverage area.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/geoNames/terms/6252001"
"/api/v1/vocab-terms/geoNames/terms/6269554"
External URI

Description: The GeoNames unique identifier for the geographic coverage area.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://sws.geonames.org/4990729"
"https://sws.geonames.org/6269554"

Complete Geographic Coverage Areas Examples (with Subfields):

- "City": "Cleveland"
  "State": "Ohio"
  "Country": "United States"
  "Continent": "North America"
  "External URI": "https://sws.geonames.org/5150529"
  "URI": "/api/v1/vocab-terms/geoNames/terms/5150529"

- "County": "Washtenaw County"
  "State": "Michigan"
  "Country": "United States"
  "Continent": "North America"
  "External URI": "https://sws.geonames.org/5014120"
  "URI": "/api/v1/vocab-terms/geoNames/terms/5014120"

- "State": "Pennsylvania"
  "Country": "United States"
  "Continent": "North America"
  "External URI": "https://sws.geonames.org/5206379"
  "URI": "/api/v1/vocab-terms/geoNames/terms/5206379"

- "Country": "Germany"
  "Continent": "Europe"
  "External URI": "https://sws.geonames.org/2921044"
  "URI": "/api/v1/vocab-terms/geoNames/terms/2921044"

- "Continent": "Africa"
  "External URI": "https://sws.geonames.org/6255146"
  "URI": "/api/v1/vocab-terms/geoNames/terms/6255146"

- "Other Geographic Area": "Global"
  "External URI": "https://sws.geonames.org/6295630"
  "URI": "/api/v1/vocab-terms/geoNames/terms/6295630"

- "Other Geographic Area": "13 U.S. states in 3 regions"
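
In the complete examples above, the local URI and the External URI of an entry end in the same GeoNames numeric identifier. The Python sketch below derives one from the other on that assumption. The mapping is inferred from the examples, not documented as a rule, and the function name local_uri_from_external is hypothetical.

```python
import re
from typing import Optional

# Local path prefix as it appears in the URI examples above.
LOCAL_GEONAMES_PREFIX = "/api/v1/vocab-terms/geoNames/terms/"

# External URIs in the examples look like https://sws.geonames.org/<digits>,
# sometimes with a trailing slash.
_EXTERNAL_RE = re.compile(r"https://sws\.geonames\.org/([0-9]+)/?")


def local_uri_from_external(external_uri: str) -> Optional[str]:
    """Return the local URI sharing the GeoNames ID, or None if the input is not recognized."""
    match = _EXTERNAL_RE.fullmatch(external_uri)
    if match is None:
        return None
    return LOCAL_GEONAMES_PREFIX + match.group(1)


print(local_uri_from_external("https://sws.geonames.org/5150529"))
```

Entries that carry only Other Geographic Area text, with no GeoNames match, would have neither URI and fall outside this sketch.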

Smallest Geographic Unit

Description: The smallest geographic unit (e.g., state or census tract) used in the dataset.

Required: No

Repeatable: No

Accepted Values: Multi-part element; see subfields

Usage Notes: Smallest Geographic Unit is intended to represent a specific, known geography (e.g., county, census district, ZIP Code, or electoral district) that is represented by a variable in the data.

If the data do not include a geographic variable by which the data can be analyzed, this element is not indicated. If all the cases are from a single state, but the cases are not subdivided geographically within that state, then 'state' is not indicated.

If there is a variable indicating which testing site a survey was taken at, but the locations of the testing sites were masked by the PI, this element is likely not indicated.

This field employs a local ICPSR controlled vocabulary; see below for terms and definitions:

Term Definition
Geocoded Location A precise geographic point derived from an address, typically represented as coordinates or address strings.
Parcel A discrete unit of land ownership, often defined in property records or tax assessments.
Grid Cell A unit of spatial data that divides an area into regular rectangular or square cells (e.g., a 1 km x 1 km grid), typically used in mapping or environmental studies.
Postal Code/Zip Code A geographic area defined by postal delivery routes or regions, used for organizing mail delivery.
Neighborhood/Community Area An informally defined area within a city, usually based on local recognition rather than official administrative boundaries.
City/Municipality A local government jurisdiction that covers urban areas, which can range from large cities to small towns and villages.
County/District/Parish A geographic area that is part of a state or province (e.g., parishes in Louisiana, boroughs in Alaska).
State/Province A major administrative division within a country. In the U.S., this includes the 50 states and the District of Columbia. Other countries, like Canada and Australia, have provinces or states (e.g., Ontario in Canada, New South Wales in Australia).
Territory A region under the jurisdiction of a national government, but not a fully self-governing state or province (e.g., Puerto Rico, Northwest Territories, Falkland Islands).
Country A sovereign nation or territory that is recognized as an independent political entity, such as the United States, Canada, or France.
Census Block The smallest geographic unit used in national censuses, often corresponding to a city block or small neighborhood.
Census Block Group A collection of adjacent census blocks—typically all blocks within part of a census tract.
Census Tract A small geographic unit used in national censuses, typically representing 2,500 to 8,000 people. Census tracts are designed to provide detailed statistical data for neighborhoods or communities.
Census Division A larger geographic area used for statistical reporting, grouping states or provinces within a country. Census divisions are smaller than regions but larger than individual states or provinces.
Census Region A broader grouping of census divisions used to organize and report data at a national level (e.g., Northeast, Midwest, South, West).
Public Use Microdata Area (PUMA) A geographic area with a population of 100,000 or more, used for the release of detailed public-use microdata from the U.S. Census.
Core-Based Statistical Area (CBSA) A term that includes both Metropolitan and Micropolitan Statistical Areas. These areas are based on urban centers and their surrounding communities as defined by the U.S. Office of Management and Budget (OMB).
Metropolitan Statistical Area (MSA) A Core-Based Statistical Area (CBSA) that includes an urban core with a population of 50,000 or more.
Micropolitan Statistical Area A Core-Based Statistical Area (CBSA) that includes an urban core population of at least 10,000 but less than 50,000.
ZIP Code Tabulation Area (ZCTA) A geographic area created by the U.S. Census Bureau to approximate the boundaries of ZIP Codes for demographic analysis.
Voting District/Precinct A geographic area used for organizing elections, often serving as the smallest electoral units where voters cast their ballots.
Congressional District A geographic area used for electing representatives to federal or state legislative offices in the United States.
Federal Court District A geographic area where a U.S. District Court has jurisdiction to hear and decide federal cases.
School District The administrative boundary for local education systems, typically overseeing public schools from elementary through secondary levels.
Indigenous/Tribal Lands An area legally recognized as an Indigenous or tribal nation, often with unique legal, cultural, or sovereignty status.

Subfields:

Property Required? Repeatable? Accepted Values Description
Label No No Text A human-readable form of the term.
Code No No Text A machine-readable/-actionable form of the term.
URI No No Text The URI for the term.
Label

Description: A human-readable form of the term.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Basic Geographic Units"
"Postal Code/Zip Code"
"State/Province"
Code

Description: A machine-readable/-actionable form of the term.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"BasicUnits"
"PostalCodeZipCode"
"StateProvince"
URI

Description: The URI for the term.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/smallestGeographicUnits/terms/BasicUnits"
"/api/v1/vocab-terms/smallestGeographicUnits/terms/PostalCodeZipCode"
"/api/v1/vocab-terms/smallestGeographicUnits/terms/StateProvince"

Complete Smallest Geographic Unit Examples (with Subfields):

- "Label": "Basic Geographic Units"
  "Code": "BasicUnits"
  "URI": "/api/v1/vocab-terms/smallestGeographicUnits/terms/BasicUnits"

- "Label": "Postal Code/Zip Code"
  "Code": "PostalCodeZipCode"
  "URI": "/api/v1/vocab-terms/smallestGeographicUnits/terms/PostalCodeZipCode"

- "Label": "State/Province"
  "Code": "StateProvince"
  "URI": "/api/v1/vocab-terms/smallestGeographicUnits/terms/StateProvince"

Study Design

Description: The procedures used to contact participants and gather data.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: The Study Design provides more detailed information than the Summary, including how surveys were prepared and administered, how interviews were conducted, or how the data were obtained and compiled, as well as information about deadlines and follow-ups to respondents.

Examples:

"Data on organizational culture in each of the 12 courts (Part 1) were obtained by administering the Court Culture Assessment Instrument (CCAI) to all judges with a felony criminal court docket and to all senior court administrators. A total of 224 respondents completed the questionnaire. The CCAI was used to assess five key dimensions of current court culture orientation: (1) dominant case management style, (2) judicial and court staff relations, (3) change management, (4) courthouse leadership, and (5) internal organization. The determination of what culture judges and court administrators desired to establish in the near future was also obtained through the application of the same instrument (CCAI) as practitioners were asked to indicate the type of culture in each work area (or content dimension) they would like to see in their court in the next five years."

Universe

Description: The total group of persons or other entities (e.g., households or organizations) that were the object of research and to which analytic results refer.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: Age, nationality, and residence commonly help to delineate a given universe, but other factors may also be involved, such as sex, race, income, veteran status, or criminal convictions. The Universe may consist of elements other than persons, such as housing units, court cases, deaths, or countries. It should be possible to tell from the description of the universe whether a given individual or element (hypothetical or real) is a member of the population under study. Typically, the Universe statement is one sentence or shorter and reflects the entire possible population that a data collection sought to study.

Examples:

"All households in the United States with phones."
"Part 1: Thirty cities in Massachusetts during 1980-1986. Parts 2-4: All residents in Massachusetts during 1986."
"Individuals self-identified as transgender, trans, genderqueer, non-binary, or other identities on the transgender identity spectrum aged 18 and older residing in the fifty U.S. states, the District of Columbia, American Samoa, Guam, Puerto Rico, and U.S. military bases overseas."
"Jihadists from the United States and Canada, along with Incels from Germany, Canada, the United States, and United Kingdom."
"All publicly funded medical examiner and coroner offices."
"Uncertified ballots for the 2000 United States presidential election in Florida."

Time Methods

Description: The methods used to collect data over time, such as a snapshot at one point in time (cross-sectional) or repeated measurement (longitudinal) to study changes or trends.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary was taken from the DDI Alliance. Source: DDI Alliance CV TimeMethod https://rdf-vocabulary.ddialliance.org/ddi-cv/TimeMethod/1.2.3/TimeMethod.html.

Subfields:

Property Required? Repeatable? Accepted Values Description
Label Yes No Text A human-readable form of the term.
Code Yes No Text A machine-readable/-actionable form of the term.
URI Yes No Text The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Cross-section"
"Longitudinal: Panel"
"Time series"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"CrossSection"
"Longitudinal.Panel"
"TimeSeries"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/timeMethods/terms/CrossSection"
"/api/v1/vocab-terms/timeMethods/terms/Longitudinal.Panel"
"/api/v1/vocab-terms/timeMethods/terms/TimeSeries"

Complete Time Methods Examples (with Subfields):

- "Label": "Cross-section"
  "Code": "CrossSection"
  "URI": "/api/v1/vocab-terms/timeMethods/terms/CrossSection"

- "Label": "Longitudinal: Panel"
  "Code": "Longitudinal.Panel"
  "URI": "/api/v1/vocab-terms/timeMethods/terms/Longitudinal.Panel"

- "Label": "Time series"
  "Code": "TimeSeries"
  "URI": "/api/v1/vocab-terms/timeMethods/terms/TimeSeries"

Units of Analysis

Description: The object(s) of analysis for the data collection, such as an organization, individual, or household.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary was taken from the DDI Alliance. Source: DDI Alliance CV AnalysisUnit https://rdf-vocabulary.ddialliance.org/ddi-cv/AnalysisUnit/2.1.3/AnalysisUnit.html.

Subfields:

Property Required? Repeatable? Accepted Values Description
Label Yes No Text A human-readable form of the term.
Code Yes No Text A machine-readable/-actionable form of the term.
URI Yes No Text The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Organization/Institution"
"Individual"
"Household"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"OrganizationOrInstitution"
"Individual"
"Household"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/analysisUnits/OrganizationOrInstitution"
"/api/v1/vocab-terms/analysisUnits/Individual"
"/api/v1/vocab-terms/analysisUnits/Household"

Complete Units of Analysis Examples (with Subfields):

- "Label": "Organization/Institution"
  "Code": "OrganizationOrInstitution"
  "URI": "/api/v1/vocab-terms/analysisUnits/OrganizationOrInstitution"

- "Label": "Individual"
  "Code": "Individual"
  "URI": "/api/v1/vocab-terms/analysisUnits/Individual"

- "Label": "Household"
  "Code": "Household"
  "URI": "/api/v1/vocab-terms/analysisUnits/Household"

Sampling Procedures

Description: The type(s) of sample and sample design used to select survey respondents to represent the population.

Required: No

Repeatable: Yes

Accepted Values: Text

Usage Notes: The sample is a selection out of the universe of all possible relevant cases (e.g., adults in the United States, housing units in three counties of Michigan, etc.) that could have been included in the data collection. Note that some studies, such as censuses, do not utilize samples but include all members of the universe.

This controlled vocabulary was taken from the DDI Alliance. Source: DDI Alliance CV SamplingProcedure https://rdf-vocabulary.ddialliance.org/ddi-cv/SamplingProcedure/1.1.4/SamplingProcedure.html

Examples:

- "Label": "Probability: Systematic random"
  "Code": "Probability.SystematicRandom"
  "URI": "/api/v1/vocab-terms/samplingProcedures/terms/Probability.SystematicRandom"

- "Label": "Theoretical Sampling"
  "Code": "TheoreticalSampling"
  "URI": "/api/v1/vocab-terms/samplingProcedures/terms/TheoreticalSampling"

- "Label": "Total universe/Complete enumeration"
  "Code": "TotalUniverseCompleteEnumeration"
  "URI": "/api/v1/vocab-terms/samplingProcedures/terms/TotalUniverseCompleteEnumeration"

Sampling Note

Description: Supplemental information about the sampling process that does not fit neatly into the Sampling Procedures field.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: A detailed discussion of such things as sampling error or other limitations of the sampling methodology is not required here.

Examples:

"National sample of telephone numbers from cell (RDD) sampling frame."
"The probability sample selected to represent the universe consists of approximately 71,000 households."

Weights

Description: The weight variables and the criteria for using them in data analysis, or other information about how the data are weighted if no weight variables are present.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: This element includes any information about weighting variables in the data, as well as any other weight information provided by the Principal Investigator. If a weighting formula or coefficient was developed, provide this formula, define its elements, and indicate how the formula is applied to the data. It is acceptable to summarize additional documentation and refer users to those resources for more information.

Examples:

"Both the TransPop and Cisgender datasets have the same variable named WEIGHT as the weighting variable. The combination datasets have a set of three weight variables (WEIGHT_TRANSPOP, WEIGHT_CISGENDER, WEIGHT_CISGENDER_TRANSPOP)"
"A weight variable with two implied decimal places has been included and must be used in any analysis."

Response Rates

Description: The percentage of respondents in the sample who participated in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: This field is only applicable if the data were collected with a survey instrument and the response rates are provided.

Examples:

"The overall response rate for this survey was 20.22%; 72.6% for existing panelists and 10.4% for new panelists, using AAPOR Response Rate 1."
"Of the 1,843 Midlife in the United States (MIDUS) respondents that researchers attempted to contact, 1,483 agreed to participate (8 percent refused participation and 11 percent either moved or were difficult to contact), yielding a response rate of approximately 81 percent."

Data Source Types

Description: The source(s) of the data as collected by the Principal Investigators.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: People, things, and other data can all be Data Source Types. This controlled vocabulary was taken from the DDI Alliance. Source: DDI Alliance CV DataSourceType https://rdf-vocabulary.ddialliance.org/ddi-cv/DataSourceType/1.0.2/DataSourceType.html.

Subfields:

Property Required? Repeatable? Accepted Values Description
Label Yes No Text A human-readable form of the term.
Code Yes No Text A machine-readable/-actionable form of the term.
URI Yes No Text The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Registers/Records/Accounts: Medical/Clinical"
"Events/Interactions"
"Research data: Published"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"RegistersRecordsAccounts.MedicalClinical"
"EventsInteractions"
"ResearchData.Published"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/dataSourceTypes/terms/RegistersRecordsAccounts.MedicalClinical"
"/api/v1/vocab-terms/dataSourceTypes/terms/EventsInteractions"
"/api/v1/vocab-terms/dataSourceTypes/terms/ResearchData.Published"

Complete Data Source Types Examples (with Subfields):

- "Label": "Registers/Records/Accounts: Medical/Clinical"
  "Code": "RegistersRecordsAccounts.MedicalClinical"
  "URI": "/api/v1/vocab-terms/dataSourceTypes/terms/RegistersRecordsAccounts.MedicalClinical"

- "Label": "Events/Interactions"
  "Code": "EventsInteractions"
  "URI": "/api/v1/vocab-terms/dataSourceTypes/terms/EventsInteractions"

- "Label": "Research data: Published"
  "Code": "ResearchData.Published"
  "URI": "/api/v1/vocab-terms/dataSourceTypes/terms/ResearchData.Published"

External Data Sources

Description: The source of the data, when that source is external to the data collection and can be independently cited.

Required: No

Repeatable: Yes

Accepted Values: Text

Usage Notes: External data sources can include websites, datasets, books, journal articles, and other sources. Each source includes at minimum the title, author, publication year, journal (if applicable), and DOI or URL for online sources. Any citation format is accepted.

Examples:

"'Voting Scores.' Congressional Quarterly Almanac 33 (1977), 487-498"
"Multi-Resolution Land Characteristics Consortium. "National Land Cover Database (CONUS), All Years," 2016. https://www.mrlc.gov/data/nlcd-land-cover-conus-all-years"
"Data file 1: United States Census Bureau (2010). TIGER/Line shapefiles, 2010 census tracts (2010 version) [Data set]. https://www2.census.gov/geo/tiger/TIGER2010/TRACT/2010/tl_2010_01_tract10.zip"

Collection Modes

Description: The method(s) or procedure(s) used to collect the data, such as an interview or experiment.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary was taken from the DDI Alliance. Source: DDI Alliance CV ModeOfCollection https://rdf-vocabulary.ddialliance.org/ddi-cv/ModeOfCollection/4.0.3/ModeOfCollection.html.

Subfields:

Property Required? Repeatable? Accepted Values Description
Label Yes No Text A human-readable form of the term.
Code Yes No Text A machine-readable/-actionable form of the term.
URI Yes No Text The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Face-to-face interview: Computer-assisted (CAPI/CAMI)"
"Measurements and tests"
"Computer-based observation"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Interview.FaceToFace.CAPIorCAMI"
"MeasurementsAndTests"
"Observation.ComputerBased"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

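Examples:

"/api/v1/vocab-terms/collectionModes/terms/Interview.FaceToFace.CAPIorCAMI"
"/api/v1/vocab-terms/collectionModes/terms/MeasurementsAndTests"
"/api/v1/vocab-terms/collectionModes/terms/Observation.ComputerBased"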

Complete Collection Modes Examples (with Subfields):

- "Label": "Face-to-face interview: Computer-assisted (CAPI/CAMI)"
  "Code": "Interview.FaceToFace.CAPIorCAMI"
  "URI": "/api/v1/vocab-terms/collectionModes/terms/Interview.FaceToFace.CAPIorCAMI"

- "Label": "Measurements and tests"
  "Code": "MeasurementsAndTests"
  "URI": "/api/v1/vocab-terms/collectionModes/terms/MeasurementsAndTests"

- "Label": "Computer-based observation"
  "Code": "Observation.ComputerBased"
  "URI": "/api/v1/vocab-terms/collectionModes/terms/Observation.ComputerBased"

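Controlled-vocabulary elements such as Collection Modes require all three subfields (Label, Code, and URI) on every entry. The Python sketch below reports which required subfields are missing or empty in one entry. It assumes entries are held as dictionaries keyed by the subfield names shown above; the function name missing_required_subfields is hypothetical.

```python
# Subfields marked "Required: Yes" for this element.
REQUIRED_SUBFIELDS = ("Label", "Code", "URI")


def missing_required_subfields(term: dict) -> list:
    """Return the names of required subfields that are absent, non-text, or blank."""
    missing = []
    for name in REQUIRED_SUBFIELDS:
        value = term.get(name)
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


entry = {
    "Label": "Measurements and tests",
    "Code": "MeasurementsAndTests",
}
print(missing_required_subfields(entry))
```

An empty list means the entry carries every required subfield; it says nothing about whether the values belong to the controlled vocabulary.
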
Collection Dates

Description: The date(s) data collection took place.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Subfields:

Property Required? Repeatable? Accepted Values Description
Start Date Yes No Text The start date of the data collection period. Must be in YYYY-MM-DD, YYYY-MM, or YYYY format with no spaces.
End Date Yes No Text The end date of the data collection period. Must be in YYYY-MM-DD, YYYY-MM, or YYYY format with no spaces.
Time Frame No No Text An optional free-text description of the data collection period, used for non-numeric dates (e.g., 'Fall 2012') or to add context when multiple dates are present.
Start Date

Description: The start date of the data collection period. Must be in YYYY-MM-DD, YYYY-MM, or YYYY format with no spaces.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"2000"
"2019-10"
"2021-03-01"
End Date

Description: The end date of the data collection period. Must be in YYYY-MM-DD, YYYY-MM, or YYYY format with no spaces.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"2000"
"2019-10"
"2021-03-01"
Time Frame

Description: An optional free-text description of the data collection period, used for non-numeric dates (e.g., 'Fall 2012') or to add context when multiple dates are present.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: The textual description ('time frame') is used to add context to the Collection Date when multiple time periods exist (e.g., to describe different study waves, dataset names, or fiscal year designation) and/or when the date cannot be expressed exclusively through numbers, such as seasons or other units of time where the data producer did not clarify the exact dates they meant.

The textual description should not simply restate the time period in words. For example, if the Collection Date is 2020-01, the Time Frame should not be 'January 2020'.

Examples:

"Fall 2001"
"Student data"

Complete Collection Dates Examples (with Subfields):

- "Start Date": "2018"
  "End Date": "2018"
  "Time Frame": "Wave 1"

- "Start Date": "2020-10"
  "End Date": "2020-10"
  "Time Frame": "Wave 2"

- "Start Date": "2003-01-01"
  "End Date": "2003-12-31"

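The Time Frame usage notes above say that the text should not simply restate the date in words (for example, 'January 2020' for 2020-01). The Python sketch below flags the simplest cases of that. It is a partial heuristic under stated assumptions: it only handles entries whose Start Date and End Date are identical and are in YYYY or YYYY-MM form, it only recognizes English month names, and the function name restates_date is hypothetical.

```python
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def restates_date(start: str, end: str, time_frame: str) -> bool:
    """Return True if time_frame only repeats a single YYYY or YYYY-MM date in words."""
    if start != end:
        return False  # ranges are out of scope for this heuristic
    parts = start.split("-")
    # Normalize whitespace and case before comparing.
    text = " ".join(time_frame.split()).casefold()
    if len(parts) == 1:
        return text == parts[0]
    if len(parts) == 2 and parts[1].isdigit() and 1 <= int(parts[1]) <= 12:
        month_name = MONTHS[int(parts[1]) - 1]
        return text == f"{month_name} {parts[0]}".casefold()
    return False


print(restates_date("2020-01", "2020-01", "January 2020"))
print(restates_date("2018", "2018", "Wave 1"))
```

A flagged Time Frame adds no context and could be left empty; values such as 'Wave 1' or 'Fall 2001' pass because they carry information the dates do not.
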
Variable Description

Description: Significant variables (particularly demographic variables) in the data files.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: The Variable Description provides more detailed information than the Summary, including a review of variables that are important for users to know about. The codebook, setup files, and variable groups are appropriate sources of information for Variable Description.

Examples:

"The data includes variables about participants' and their parents' moods, interviewer observations, families' activities, families' health history, participants' school records, and parents' substance use. Demographic variables include race, religion, annual household income, and the participants' parents' employment statuses."
"The LGBTQ Hate Crimes Interviews dataset contains more in-depth information, including victim demographic information, substance abuse history, information on whether the victim is open about their LGBTQ identification, the victim's job status, and information about how the victim reacted to the crime, such as whether or not they reported the crime to the police and their level of cooperation with the police and prosecution."

Scales

Description: Any commonly known scales, measures, or inventories used in the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: Include common scales that can be readily identified from the data, documentation, or other related materials. Examples of common scales include the Minnesota Multiphasic Personality Inventory (MMPI) and the Consumer Price Index (CPI). ICPSR curators are not expected to infer or research scales that are not explicitly indicated. Scales can be cited as a list or described in full sentences, and citations should include DOIs or URLs whenever possible. If the questionnaire uses a finite list of responses (e.g., 'Always, Sometimes, Rarely, Never' or 'Strongly Agree, Agree, Disagree, Strongly Disagree'), this element may note 'A Likert-type scale was used' or 'Several Likert-type scales were used.' Given their ubiquity, however, Likert-type scales need not be noted when they are the only scales used.

Examples:

"The baseline data collection included one scale - the CES-D index for maternal depression [Cole, J. C., Rabin, A. S., Smith, T. L., and Kaufman, A. S. (2004). Development and validation of a Rasch-derived CES-D short form. Psychological assessment, 16(4), 360. https://doi.org/10.1037/1040-3590.16.4.360]. All scales used for outcomes at ages 1 through 3 are listed in Appendix Tables 1 and 2 in the User Guide. Please refer to the User Guide and P.I. Codebook, available under the 'Data and Documentation' tab, for details."
"Squires, J., Bricker, D. D., and Twombly, E. (2009). Ages and stages questionnaires. Baltimore, MD: Paul H. Brookes."
"Briggs-Gowan, M. J., Carter, A. S., Irwin, J. R., Wachtel, K., and Cicchetti, D. V. (2004). The Brief Infant-Toddler Social and Emotional Assessment: screening for social-emotional problems and delays in competence. Journal of pediatric psychology, 29(2), 143-155. https://doi.org/10.1093/jpepsy/jsh017"
"Yu, L., Buysse, D. J., Germain, A., Moul, D. E., Stover, A., Dodds, N. E., ... and Pilkonis, P. A. (2012). Development of short forms from the PROMIS sleep disturbance and sleep-related impairment item banks. Behavioral sleep medicine, 10(1), 6-24. https://doi.org/10.1080/15402002.2012.636266"

Data Management Plan

Description: A link to the data management plan (preferably a persistent identifier such as a DOI).

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://doi.org/10.48321/D1EA6EF78D"
"https://rdm.mcmaster.ca/dmps/promoting-healthy-families-data-management-plan"

Preregistration

Description: A link to a research plan for the data collection (preferably a persistent identifier such as a DOI).

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://doi.org/10.17605/OSF.IO/67DUT"
"https://doi.org/10.1257/rct.15789-1.0"

Software Applications

Description: Software that the principal investigator(s) used to collect or analyze the data and that is required to understand how the data were obtained or to reproduce results.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Software Name | Yes | No | Text | The name of the software application.
Software Version | No | No | Text | The version of the application.
Software Description | No | No | Text | Short description or overview of the application and its intended purpose.
Programming Languages | No | Yes | Text | The programming language(s) used in the development of the application.
Operating Systems | No | Yes | Text | Computer operating systems supported by the application.
Memory Requirements | No | No | Text | Minimum memory (e.g., RAM) requirements to operate the application.
Processor Requirements | No | No | Text | Processor architecture required to run the application.
Software Requirements | No | No | Text | Required components for the application, like runtime environments and shared libraries not included in the package but needed to run it.
Storage Requirements | No | No | Text | Amount of storage space required by the application.
Device Requirements | No | No | Text | Device required to run the application. Used in cases where a specific make/model is required to run the application.
License | No | No | Text | The license associated with the application, preferably expressed as a URL.
Download URL | No | No | Text | A direct link to a downloadable software artifact (e.g., executable, package, archive, or single script file) that retrieves the application itself, without additional navigation or instructions.
Installation URL | No | No | Text | A link to a repository or project landing page where users can obtain resources and instructions to install the application (as opposed to directly downloading a single file).
Software Name

Description: The name of the software application.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"JHOVE"
"ffmpeg"
"json-schema-for-humans"
Software Version

Description: The version of the application.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"1"
"2.0.4"
"Auto-Build 2023-01-15 12:36"
Software Description

Description: Short description or overview of the application and its intended purpose

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"JHOVE, the JSTOR/Harvard Object Validation Environment, is an extensible software framework for performing format identification, validation, and characterization of digital objects."
"ffmpeg is a very fast video and audio converter that can also grab from a live audio/video source. It can also convert between arbitrary sample rates and resize video on the fly with a high quality polyphase filter."
Programming Languages

Description: The programming language(s) used in the development of the application

Required: No

Repeatable: Yes

Accepted Values: Text

Examples:

"python"
"shell"
"r"
"other"
Operating Systems

Description: Computer operating systems supported by the application

Required: No

Repeatable: Yes

Accepted Values: Text

Examples:

"windows"
"mac"
"linux"
"other"
Memory Requirements

Description: Minimum memory (e.g., RAM) requirements to operate the application

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"4 GB"
"1GB of RAM (2GB for a 64-bit version)"
"4 GB of GPU memory for HD and some 4K media; 6 GB or more for 4K and higher"
Processor Requirements

Description: Processor architecture required to run the application

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Intel i5/ i7/ Ryzen 7"
"Minimum 1 GHz; Recommended 2GHz or more"
"2.5–2.9 GHz or faster processor"
Software Requirements

Description: Required components for the application, like runtime environments and shared libraries not included in the package but needed to run it.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"Java runtime environment"
"Requires additional Python libraries: numpy, v1.11.2; scipy, v0.18.1, and pandas, v0.19.0"
"Compile with GNU auto tools"
Storage Requirements

Description: Amount of storage space required by the application

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"3.5 GB for new installations, 5 GB for upgrades (including temporary files required during installation)"
"15 GB of free disk space"
"8 GB of available hard-disk space for installation; additional free space required during installation"
Device Requirements

Description: Device required to run the application. Used in cases where a specific make/model is required to run the application

Required: No

Repeatable: No

Accepted Values: Text

License

Description: The license associated with the application, preferably expressed as a URL.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://www.apache.org/licenses/LICENSE-2.0"
"https://opensource.org/licenses/LGPL-2.0"
Download URL

Description: A direct link to a downloadable software artifact (e.g., executable, package, archive, or single script file) that retrieves the application itself, without additional navigation or instructions.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://github.com/richardlehane/siegfried/archive/refs/heads/main.zip"
"https://cdn.nationalarchives.gov.uk/documents/droid-binary-6.5.2-bin-win32-with-jre.zip"
Installation URL

Description: A link to a repository or project landing page where users can obtain resources and instructions to install the application (as opposed to directly downloading a single file).

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://github.com/richardlehane/siegfried"
"https://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/"

Complete Software Applications Examples (with Subfields):

- "Software Name": "siegfried"
  "Software Version": "1.11.1"
  "Software Description": "Siegfried is a signature-based file format identification\
    \ tool, implementing the National Archives UK's PRONOM file format signatures;\
    \ freedesktop.org's MIME-info file format signatures; the Library of Congress's\
    \ FDD file format signatures (beta); and Wikidata (beta)."
  "Programming Languages":
  - "go"
  - "javascript"
  - "other"
  "Operating Systems":
  - "mac"
  - "linux"
  - "windows"
  "License": "https://www.apache.org/licenses/LICENSE-2.0"
  "Download URL": "https://github.com/richardlehane/siegfried/archive/refs/heads/main.zip"

General Data Formats

Description: The file format types present in the data collection.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Usage Notes: This controlled vocabulary is taken from the DDI Alliance. Source: DDI Alliance Controlled Vocabulary GeneralDataFormat, https://rdf-vocabulary.ddialliance.org/ddi-cv/GeneralDataFormat/2.0.3/GeneralDataFormat.html.

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Label | Yes | No | Text | A human-readable form of the term.
Code | Yes | No | Text | A machine-readable/-actionable form of the term.
URI | Yes | No | Text | The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Text"
"Still image"
"Numeric"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Text"
"StillImage"
"Numeric"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Complete General Data Formats Examples (with Subfields):

- "Label": "Text"
  "Code": "Text"
  "URI": "/api/v1/vocab-terms/generalDataFormats/terms/Text"

- "Label": "Still image"
  "Code": "StillImage"
  "URI": "/api/v1/vocab-terms/generalDataFormats/terms/StillImage"
- "Label": "Numeric"
  "Code": "Numeric"
  "URI": "/api/v1/vocab-terms/generalDataFormats/terms/Numeric"

Notes

Description: Important details about the data collection (like unique authoring, discrepancies, or processing information) that cannot be recorded in other metadata elements.

Required: No

Repeatable: Yes

Accepted Values: Text

Usage Notes: Notes should include any information that does not fit elsewhere in the metadata, such as unique aspects of how the data were processed, discrepancies between the metadata and documentation files, information about the research team, or series-specific notes.

Examples:

"Information on the Index of Consumer Sentiment, the Index of Current Economic Conditions, and the Index of Consumer Expectations and how they were created can be found in the P.I. Codebook"
"Dataset 1 should be attributed to Jane Doe while datasets 2-6 should be attributed to John Doe"
"Additional information on the Survey of Consumers can be found by visiting the Survey of Consumers Website"

Manuscript Number

Description: A unique identifier that associates the data collection with a manuscript submitted to a journal.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"ECIN-Mar-2025-0078.R2"
"AER-2019-0000"

ADA Accessibility

Description: Indicates whether the data collection is ADA accessible (that is, it conforms to WCAG 2.1 AA standards) or is not ADA accessible but qualifies for the ADA archival exception.

Required: No

Repeatable: No

Accepted Values: Multi-part element; see subfields

Usage Notes: This field employs a local ICPSR controlled vocabulary; see below for terms and definitions:

Term | Definition
ADA Accessible | The item is ADA accessible, conforming to WCAG 2.1 AA standards.
ADA Archival | The item is not ADA accessible, but qualifies for the ADA archival exception.

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Label | Yes | No | Text | A human-readable form of the term.
Code | Yes | No | Text | A machine-readable/-actionable form of the term.
URI | Yes | No | Text | The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"ADA Accessible"
"ADA Archival"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"ada.accessible"
"ada.archival"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/adaAccessibility/terms/ada.accessible"
"/api/v1/vocab-terms/adaAccessibility/terms/ada.archival"

Complete ADA Accessibility Examples (with Subfields):

"Label": "ADA Accessible"
"Code": "ada.accessible"
"URI": "/api/v1/vocab-terms/adaAccessibility/terms/ada.accessible"

"Label": "ADA Archival"
"Code": "ada.archival"
"URI": "/api/v1/vocab-terms/adaAccessibility/terms/ada.archival"

License

Description: A license governing the data's use.

Required: No

Repeatable: No

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Label | Yes | No | Text | A human-readable form of the term.
Code | Yes | No | Text | A machine-readable/-actionable form of the term.
URI | Yes | No | Text | The URI for the term.
Label

Description: A human-readable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Creative Commons Attribution 4.0 International"
"Apache License 1.0"
Code

Description: A machine-readable/-actionable form of the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"CC-BY-4.0"
"Apache-1.0"
URI

Description: The URI for the term.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"/api/v1/vocab-terms/licenses/terms/CC-BY-4.0"
"/api/v1/vocab-terms/licenses/terms/Apache-1.0"

Complete License Examples (with Subfields):

"Label": "Creative Commons Attribution 4.0 International"
"Code": "CC-BY-4.0"
"URI": "/api/v1/vocab-terms/licenses/terms/CC-BY-4.0"

"Label": "Apache License 1.0"
"Code": "Apache-1.0"
"URI": "/api/v1/vocab-terms/licenses/terms/Apache-1.0"

Version History

Description: A record of how the data collection has changed over time.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Version Number | No | No | Text | A version number for a study.
Version Date | No | No | Text | The date on which a given version of a data collection was released.
Version Note | No | No | Text | Provenance information about a given version of the data collection.
Version Number

Description: A version number for a study.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: Every ICPSR data collection is assigned version 1.0 when it is first published. When the data collection is updated, a new version number is assigned. For substantive changes to the data collection, including changes to data files, title, or principal investigators, a new major version is created, the version number increases by 1 (for example, from 1.0 to 2.0), and a new version-specific digital object identifier (DOI) is created. For all other changes, a new minor version is created, the version number increases by 0.1 (for example, from 2.0 to 2.1), and the DOI does not change.
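
The numbering rule above can be sketched in Python. This is an illustration under stated assumptions, not ICPSR's implementation: `next_version` is a hypothetical name, the minor part is treated as an integer counter (so 2.9 would be followed by 2.10, which the rule above does not specify), and the result is returned in the 'major.minor' form without a 'V' prefix.

```python
def next_version(current: str, substantive: bool) -> str:
    """Return the version number that follows `current`.

    substantive=True models a major change (data files, title, or
    principal investigators); False models any other change.
    """
    number = current.lstrip("Vv")  # accept "V2.1" as well as "2.1"
    major_text, _, minor_text = number.partition(".")
    major = int(major_text)
    minor = int(minor_text) if minor_text else 0
    if substantive:
        return f"{major + 1}.0"  # new major version (and a new DOI)
    return f"{major}.{minor + 1}"  # new minor version (DOI unchanged)
```

For instance, `next_version("1.0", substantive=True)` gives "2.0", and `next_version("2.0", substantive=False)` gives "2.1", matching the examples in the rule.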

Examples:

"V1"
"V2.1"
"V3.2"
Version Date

Description: The date on which a given version of a data collection was released.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: ICPSR automatically generates this date for data collection additions and updates.

Examples:

"2020-07-20"
"2022-01-31"
Version Note

Description: Provenance information about a given version of the data collection.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"File CB3025.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads."
"The data producer provided additional data files."
"The codebook descriptions of variables TANSUP, EMOSUP, and SOCSUP were corrected."

Complete Version History Examples (with Subfields):

- "Version Number": "V2.1"
  "Version Date": "2025-10-03"
  "Version Note": "Updated study summary."

- "Version Number": "V2"
  "Version Date": "2023-08-12"
  "Version Note": "The data producer provided additional data files."

- "Version Number": "V1"
  "Version Date": "2021-03-01"
  "Version Note": "Initial release"
- "Version Number": "V1"
  "Version Date": "2024-06-28"
  "Version Note": "Initial release"

Distributors

Description: The organization(s) responsible for distributing the data collection.

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Organization | Yes | No | Multi-part element; see subfields | Name and other details about the organization that distributes the data collection.
Order | Yes | No | Number | The order of importance for the distributors of the data collection.
Organization

Description: Name and other details about the organization that distributes the data collection.

Required: Yes

Repeatable: No

Accepted Values: Multi-part element; for more information, see the Organization field

Order

Description: The order of importance for the distributors of the data collection.

Required: Yes

Repeatable: No

Accepted Values: Number

Usage Notes: A value of '0' indicates the primary distributor, '1' the second, and so forth.
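
Because Order is zero-based, the primary distributor is the entry with the smallest Order value. A minimal Python sketch follows; `sort_distributors` is a hypothetical helper, and the key names ("Organization", "Name", "Order") are taken from the complete Distributors examples in this section.

```python
def sort_distributors(distributors: list[dict]) -> list[dict]:
    """Return distributors from primary (Order 0) to least important."""
    return sorted(distributors, key=lambda entry: entry["Order"])


distributors = [
    {"Organization": {"Name": "GESIS - Leibniz-Institute for the Social Sciences"},
     "Order": 1},
    {"Organization": {"Name": "Inter-university Consortium for Political and Social Research"},
     "Order": 0},
]
primary = sort_distributors(distributors)[0]
print(primary["Organization"]["Name"])
# prints: Inter-university Consortium for Political and Social Research
```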

Examples:

"0"
"1"
"2"

Complete Distributors Examples (with Subfields):

- "Organization":
    "Name": "Inter-university Consortium for Political and Social Research"
    "Ror": "https://ror.org/02q7mkh03"
  "Order": 0

- "Organization":
    "Name": "GESIS - Leibniz-Institute for the Social Sciences"
    "Ror": "https://ror.org/018afyw53"
  "Order": 1
- "Organization":
    "Name": "Roper Center for Public Opinion Research"
  "Order": 0

Study Number

Description: A unique, numerical value used by ICPSR to identify and track data collections.

Required: Yes

Repeatable: No

Accepted Values: Number

Usage Notes: The study number is automatically generated by ICPSR and is unique. Current study numbers are five or six digits, though four-digit numbers were once standard and are still acceptable.

Examples:

"2760"
"3025"
"38672"

Digital Object Identifier (DOI)

Description: The registered persistent digital object identifier (DOI) associated with the data collection.

Required: Yes

Repeatable: No

Accepted Values: Text

Usage Notes: ICPSR Digital Object Identifiers (DOIs) are persistent identifiers provided by DataCite, a DOI registration agency. Each DOI (such as 'https://doi.org/10.3886/ICPSR39523.v1') has three components:

  1. https://doi.org – the DOI resolver, a web address used to look up a DOI and redirect to the resource
  2. 10.3886 – the DOI prefix, where '10' identifies the DOI system and '3886' is a unique registrant identifier for ICPSR
  3. ICPSR39523.v1 – the DOI suffix, made up of 'ICPSR', the ICPSR study number, and the version number

The study number is automatically generated by ICPSR and is unique. Current study numbers are five or six digits. Four-digit numbers were once standard and are still acceptable. Additionally, DOIs containing six-digit study numbers prefixed with 'E' (for example, https://doi.org/10.3886/E247464V1) were once used for studies self-published at ICPSR.

Study numbers with fewer than five digits are padded with leading zeros in the DOI (e.g., Study Number 4 is represented as '10.3886/ICPSR00004').
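
The three components and the zero-padding rule can be combined in a short Python sketch. The names `build_doi` and `parse_doi` are hypothetical. The sketch assumes that the version in the DOI is the major version number, which is inferred from the Version Number usage notes rather than stated here. It writes a lowercase 'v' but accepts either case when parsing, since both appear in this document's examples, and it does not handle the older 'E'-prefixed form.

```python
import re

DOI_RESOLVER = "https://doi.org"
ICPSR_PREFIX = "10.3886"

# Assumption: five- or six-digit study number, then '.v' or '.V' and a version.
_DOI_RE = re.compile(r"https://doi\.org/10\.3886/ICPSR(\d{5,6})\.[vV](\d+)")


def build_doi(study_number: int, major_version: int) -> str:
    """Assemble a DOI, left-padding the study number to five digits."""
    suffix = f"ICPSR{study_number:05d}.v{major_version}"
    return f"{DOI_RESOLVER}/{ICPSR_PREFIX}/{suffix}"


def parse_doi(doi: str) -> tuple[int, int]:
    """Return (study_number, version) or raise ValueError."""
    match = _DOI_RE.fullmatch(doi)
    if match is None:
        raise ValueError(f"not an ICPSR study DOI: {doi!r}")
    return int(match.group(1)), int(match.group(2))
```

With these definitions, `build_doi(4, 1)` returns "https://doi.org/10.3886/ICPSR00004.v1".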

Examples:

"https://doi.org/10.3886/ICPSR300449.V2"
"https://doi.org/10.3886/ICPSR06425.v1"

Citation

Description: The official way to reference the data collection in writing.

Required: No

Repeatable: No

Accepted Values: Text

Usage Notes: The Citation is dynamically assembled from other entry fields in this format: PI (list). Title. Distributor (list), Issued Date. DOI. Note: ICPSR 'union catalog' records (i.e., external resources to which ICPSR links as a courtesy) do not have citations.
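
The assembly format above can be sketched in Python. This is not ICPSR's citation generator: `join_names` and `assemble_citation` are hypothetical names, PI names are assumed to arrive already formatted (for example, 'Family, Given' for people), and the joining rules for exactly two PIs and for multiple distributors are guesses. The '[distributor]' label follows the examples in this section.

```python
def join_names(names: list[str]) -> str:
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]


def assemble_citation(pis: list[str], title: str, distributors: list[str],
                      issued_date: str, doi: str) -> str:
    """Build 'PI (list). Title. Distributor (list), Issued Date. DOI'."""
    pi_part = join_names(pis)
    if not pi_part.endswith("."):
        pi_part += "."  # avoid a doubled period after an initial such as "M."
    distributor_part = ", ".join(
        f"{name} [distributor]" for name in distributors)
    return f"{pi_part} {title}. {distributor_part}, {issued_date}. {doi}"
```

Given the PI list, title, distributor, date, and DOI of the Public Libraries citation shown under Examples, this function reproduces that citation string.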

Examples:

"Sickmund, Melissa, Hockenberry, Sarah, and Puzzanchera, Charles M. National Juvenile Court Data Archive, United States, 1985-2019. Inter-university Consortium for Political and Social Research [distributor], 2022-07-28. https://doi.org/10.3886/ICPSR38418.v1"
"Institute of Museum and Library Services. Public Libraries in the United States Survey, 2016-2018. Inter-university Consortium for Political and Social Research [distributor], 2021-10-07. https://doi.org/10.3886/ICPSR37992.v1"

Person

Description: A person associated with an ICPSR data collection or service.

Required: No

Repeatable: No

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Personal Name | Yes | No | Multi-part element; see subfields | The person's name.
ORCID Identifier | No | No | Text | The person's Open Researcher and Contributor ID (ORCID).
Affiliation(s) | No | Yes | Multi-part element; see subfields | The person's affiliated organization(s).
Email Address | No | No | Text | The person's email address.
Personal Name

Description: The person's name.

Required: Yes

Repeatable: No

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Given Name (First Name) | Yes | No | Text | The person's first (given) name, which may include a middle name or initial.
Family Name (Last Name) | Yes | No | Text | The person's last (family) name, which may include a suffix (e.g., Jr., Sr., IV).
Given Name (First Name)

Description: The person's first (given) name, which may include a middle name or initial.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Miner P."
"Robert J."
"Claudia"
Family Name (Last Name)

Description: The person's last (family) name, which may include a suffix (e.g., Jr., Sr., IV).

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Marchbanks III"
"Shiller"
"Goldin"

Complete Personal Name Examples (with Subfields):

"Given Name (First Name)": "Susan B."
"Family Name (Last Name)": "Anthony"

"Given Name (First Name)": "John"
"Family Name (Last Name)": "Doe IV"
ORCID Identifier

Description: The person's Open Researcher and Contributor ID (ORCID).

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://orcid.org/0009-0006-2316-6486"
"https://orcid.org/0000-0003-3842-1604"
Affiliation(s)

Description: The person's affiliated organization(s).

Required: No

Repeatable: Yes

Accepted Values: Multi-part element; for more information, see the Organization field

Email Address

Description: The person's email address.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"j.doe@example.com"

Complete Person Examples (with Subfields):

"Personal Name":
  "Given Name (First Name)": "Robert J."
  "Family Name (Last Name)": "Shiller"
"ORCID Identifier": "https://orcid.org/0009-0006-2316-6486"
"Affiliation(s)":
- "Name": "Yale University"
  "Ror": "https://ror.org/03v76x132"
- "Name": "MacroMarkets"

"Personal Name":
  "Given Name (First Name)": "Claudia"
  "Family Name (Last Name)": "Goldin"
"ORCID Identifier": "https://orcid.org/0000-0003-3842-1604"
"Affiliation(s)":
- "Name": "Harvard University"
  "Ror": "https://ror.org/03vek6s52"

"Personal Name":
  "Given Name (First Name)": "Miner P."
  "Family Name (Last Name)": "Marchbanks III"

Organization

Description: An organization associated with an ICPSR data collection or service.

Required: No

Repeatable: No

Accepted Values: Multi-part element; see subfields

Subfields:

Property | Required? | Repeatable? | Accepted Values | Description
Organization Name | Yes | No | Text | The organization's name.
ROR Identifier | No | No | Text | The organization's Research Organization Registry (ROR) identifier.
Email Address | No | No | Text | The organization's email address.
Organization Name

Description: The organization's name.

Required: Yes

Repeatable: No

Accepted Values: Text

Examples:

"Federal Reserve Bank of St. Louis"
"University of Michigan"
ROR Identifier

Description: The organization's Research Organization Registry (ROR) identifier.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"https://ror.org/02q7mkh03"
Email Address

Description: The organization's email address.

Required: No

Repeatable: No

Accepted Values: Text

Examples:

"info@example.com"

Complete Organization Examples (with Subfields):

"Organization Name": "Urban Institute"
"ROR Identifier": "https://ror.org/017pz3h73"
"Email Address": "info@urban.institute"

"Organization Name": "Bureau of Justice Statistics"
"ROR Identifier": "https://ror.org/0006s4z66"

"Organization Name": "Internal Revenue Service"

ICPSR Metadata Schema Version History

Date | Version | Note
May 11, 2026 | v1 | Initial release and publication of the ICPSR Metadata Schema.