About Psych-DS
Psych-DS is a community data standard for research in psychology and other behavioral sciences. It provides a flexible set of conventions for formatting and documenting scientific datasets, and it is heavily inspired by the Brain Imaging Data Structure (BIDS) standard for neuroimaging data.
What is Psych-DS?
Psych-DS provides a simple, easy-to-adopt standard for organizing data in the psychological and behavioral sciences. It aims to help researchers satisfy the FAIR principles (findable, accessible, interoperable, reusable) for data sharing.
Key Goals
- To promote the adoption of good, consistent practices in the management of behavioral data
- To create a machine-readable format for these datasets that can support tools for their analysis, discovery, and preparation
Why do I need Psych-DS?
In the social and behavioral sciences:
- Datasets can be arranged in many different ways and use various file formats
- No consensus exists about how to organize and share project data
- Even researchers within the same lab may arrange data differently
- Lack of standardization leads to:
  - Miscommunications
  - Time wasted on reformatting/rearranging data
  - Difficulties indexing datasets for search tools
  - Challenges in writing reusable analysis scripts
Getting Started with Psych-DS
Documentation Resources
- Getting Started Guide: Step-by-step guidance for creating your first Psych-DS dataset
- Rules and Conventions: Basic requirements for Psych-DS compliance
- Advanced Practices: Guidance on more advanced topics relating to metadata and file structure
- Error Reference: Descriptions of, and troubleshooting tips for, all common errors
- Schema Reference: Documentation generated from the technical schema, covering all rules, objects, and definitions
- Technical Reference: Official schema model, written in LinkML
Core Components
To be compliant with Psych-DS, focus on two key aspects:
1. Metadata
Understanding Metadata
Metadata provides rich contextual information about a dataset, including:
- Summary of contents
- Creation and modification records
- Essential context regarding the provenance/design of the study
Without Metadata
Without standardized metadata:
- Context is provided on an ad-hoc basis
- Information lives in email communication or separate documentation
- Sharing requires re-explaining context
- Machine readability is impossible
[Figure: Traditional email-based sharing]
With Metadata
Standardized metadata provides:
- Permanent attachment to data
- Consistent information structure
- Machine readability
- Integration with semantic web standards
Key features:
- Uses JSON-LD formatting
- Integrates with Schema.org vocabulary
- Supports both minimal and comprehensive documentation
Structured Metadata Example
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "X Experiment",
  "author": {
    "@type": "Person",
    "name": "Test Researcher",
    "@id": "https://orcid.org/0022-0002-3833-3472"
  },
  "description": "A self-paced reading study with N participants...",
  "funding": {
    "@type": "Grant",
    "@id": "https://dx.doi.org/10.1080/02626667.2018.1560449",
    "name": "Y Grant"
  },
  "locationCreated": {
    "@type": "Place",
    "name": "Z facility",
    "address": "123 Main St..."
  }
}
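Because the metadata is ordinary JSON-LD, a script can read it with a standard JSON parser. The following is a minimal sketch, not the official validator: it parses a trimmed copy of the example above and checks only the keys that the example itself shows (`@context`, `@type`, and `name`). The helper name `check_dataset_metadata` is hypothetical, and the sketch assumes that `@context` is a plain string, as in the example.

```python
import json

# Trimmed copy of the example metadata shown above.
EXAMPLE_METADATA = """
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "X Experiment",
  "author": {"@type": "Person", "name": "Test Researcher"},
  "description": "A self-paced reading study with N participants..."
}
"""


def check_dataset_metadata(text):
    """Return a list of problems found in a JSON-LD metadata string.

    Hypothetical helper for illustration only. It checks what the
    example above shows, not the full Psych-DS rule set.
    """
    try:
        metadata = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(metadata, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    # Assumption: @context is a plain string, as in the example.
    context = metadata.get("@context")
    if not isinstance(context, str) or "schema.org" not in context:
        problems.append("@context should reference schema.org")
    if metadata.get("@type") != "Dataset":
        problems.append('@type should be "Dataset"')
    if not metadata.get("name"):
        problems.append("name is missing or empty")
    return problems


if __name__ == "__main__":
    print(check_dataset_metadata(EXAMPLE_METADATA) or "no problems found")
```

An empty list means that none of these three checks failed. The Psych-DS validators remain the authoritative check for compliance.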
2. File Organization
The Challenge of Unstructured Data
Without standardization:
- Files may be scattered across directories
- Naming conventions vary widely
- Mixed formats and processing states create confusion
[Figure: Unstructured dataset example]
Psych-DS File Structure
Key requirements:
- Dedicated data/ subdirectory
- CSV format for data files
- _data suffix in filenames
- "Keyword" formatting for file properties
[Figure: Structured dataset example]
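The file structure requirements listed above can be expressed as a small path check. The following is a rough sketch under stated assumptions, not the official validator. In particular, it assumes that "keyword" formatting means keyword-value pairs joined by underscores (for example, `study-1_participant-2_data.csv`). The Schema Reference has the authoritative rules, and the names `DATA_FILE_PATTERN` and `check_data_path` are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: one or more keyword-value pairs, each followed by an
# underscore, then the "data" suffix and a ".csv" extension.
DATA_FILE_PATTERN = re.compile(r"^(?:[A-Za-z0-9]+-[A-Za-z0-9]+_)+data\.csv$")


def check_data_path(path):
    """Return a list of problems for one dataset-relative file path."""
    parts = PurePosixPath(path).parts
    problems = []

    # Requirement: data files live under a dedicated data/ subdirectory.
    if not parts or parts[0] != "data":
        problems.append("file is not inside the data/ subdirectory")

    name = parts[-1] if parts else ""
    if not name.endswith(".csv"):
        # Requirement: data files use the CSV format.
        problems.append("file is not a CSV file")
    elif not name.endswith("_data.csv"):
        # Requirement: filenames carry the _data suffix.
        problems.append("filename lacks the _data suffix")
    elif not DATA_FILE_PATTERN.match(name):
        # Assumed reading of the "keyword" formatting requirement.
        problems.append("filename is not made of keyword-value pairs")
    return problems


if __name__ == "__main__":
    for candidate in ("data/study-1_participant-2_data.csv", "results/final.csv"):
        print(candidate, check_data_path(candidate))
```

Returning a list of problems, rather than a single boolean, loosely mirrors the validator features described in the next section: an empty list corresponds to a valid result, and each entry is a separate error report.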
Validation Tools
The Psych-DS team provides validation tools across multiple platforms:
- Browser-based (best option for most researchers)
- npm package (best option for developers)
- Python library (coming soon)
- R package (coming soon)
Features
- Binary VALID/INVALID output
- Detailed error reporting
- Optional warning flags
- Client-side processing for privacy
Privacy Commitment
All validation is performed locally. No data is uploaded or stored during validation. The browser-based tool uses client-side JavaScript exclusively, with no server interaction or database storage.