Datafile

Definition:

A CSV file under the /data directory in which the official Psych-DS compliant data from the dataset is stored. Datafiles must follow Psych-DS file naming conventions, which include keyword formatting, the '_data' suffix, and the '.csv' extension. An example of a valid datafile name is 'study-123_site-lab4_data.csv'. More official suffixes and extensions may be made available in the future. A controlled list of official keywords is provided, but unofficial keywords are permitted, so long as they are clearly defined and used consistently within a research community.
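
As an illustration of the naming convention, the following minimal sketch splits a datafile name into its keyword-value pairs. The function name `parse_datafile_name` is hypothetical and not part of any Psych-DS tooling. The sketch only assumes what the definition above states: pairs joined by '-', separated by '_', followed by the '_data' suffix and the '.csv' extension.

```python
from pathlib import PurePosixPath


def parse_datafile_name(path: str) -> dict[str, str]:
    """Split a Psych-DS style datafile name into keyword-value pairs.

    Hypothetical helper for illustration; not an official Psych-DS API.
    """
    name = PurePosixPath(path).name
    if not name.endswith("_data.csv"):
        raise ValueError(f"{name!r} lacks the '_data' suffix or '.csv' extension")

    # Drop the suffix and extension, leaving only the keyword-value chunks.
    stem = name[: -len("_data.csv")]

    pairs: dict[str, str] = {}
    for chunk in stem.split("_"):
        keyword, sep, value = chunk.partition("-")
        if not sep or not keyword or not value:
            raise ValueError(f"{chunk!r} is not a 'keyword-value' pair")
        pairs[keyword] = value
    return pairs


print(parse_datafile_name("data/study-123_site-lab4_data.csv"))
# {'study': '123', 'site': 'lab4'}
```

The sketch does not check keywords against the official list, since unofficial keywords are permitted.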

Properties:

| Property | Value | Description |
| --- | --- | --- |
| requires | data | Set of schema locations defining the objects that must be present for certain issues to be reported. |
| suffix | data | String following the final '_' in a filename and preceding the '.' of the extension. Primarily used to identify datafiles. |
| extensions | ['.csv'] | Extension of the current file, including the initial dot. |
| baseDir | data | Name of the directory under which the file object is expected to appear. |
| arbitraryNesting | True | Indicator for whether a given file object is allowed to be nested within an arbitrary number of subdirectories. |
| columnsMatchMetadata | True | Each datafile must only use column headers that appear in the 'variableMeasured' property of the compiled metadata object that corresponds to it. |
| usesKeywords | True | Indicator for whether a given file object requires keyword formatting. |
| nonCanonicalKeywordsAllowed | True | Indicator for whether a given file object may use keywords that are not on the official Psych-DS list. |
| fileRegex | `([a-z]+-[a-zA-Z0-9]+)(_[a-z]+-[a-zA-Z0-9]+)*_data\.csv` | Regular expression defining the legal formatting of a filename. |
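
The fileRegex and columnsMatchMetadata properties can be sketched in Python as follows. The pattern is copied from the fileRegex row. Everything else is an assumption made for illustration: the function names are hypothetical, the pattern is applied to the whole filename with `re.fullmatch` (the table does not say how the validator anchors it), and 'variableMeasured' is treated as a flat list of column-name strings.

```python
import csv
import io
import re

# Pattern taken from the fileRegex property above.
FILE_REGEX = re.compile(r"([a-z]+-[a-zA-Z0-9]+)(_[a-z]+-[a-zA-Z0-9]+)*_data\.csv")


def is_valid_datafile_name(filename: str) -> bool:
    """Return True if the bare filename (no directories) matches fileRegex.

    Assumption: the whole name must match, hence fullmatch.
    """
    return FILE_REGEX.fullmatch(filename) is not None


def unexpected_columns(csv_text: str, variable_measured: list[str]) -> list[str]:
    """List column headers that are absent from variableMeasured.

    Assumption: variable_measured is a flat list of column-name strings.
    """
    header = next(csv.reader(io.StringIO(csv_text)), [])
    allowed = set(variable_measured)
    return [column for column in header if column not in allowed]


print(is_valid_datafile_name("study-123_site-lab4_data.csv"))  # True
print(is_valid_datafile_name("study_data.csv"))                # False
print(unexpected_columns("id,rt,notes\n1,350,ok\n", ["id", "rt"]))  # ['notes']
```

An empty list from `unexpected_columns` corresponds to a datafile that satisfies columnsMatchMetadata under these assumptions.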

If object not found:

| Property | Value |
| --- | --- |
| code | MISSING_DATAFILE |
| level | error |
| reason | No CSV files were found in the data subdirectory (or all of the CSV files found there had a problem - see other error messages.) There must be at least one valid csv datafile under the data/ subdirectory. |
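
A minimal sketch of how this condition might be detected is shown below. It is not the validator's actual implementation: `check_for_datafiles` is a hypothetical name, the returned dictionary simply mirrors the code and level listed above, and "valid" is reduced to matching the fileRegex pattern, whereas a real check would also consider the other reported errors. The recursive search reflects the arbitraryNesting property.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

# Pattern taken from the fileRegex property of the Datafile object.
FILE_REGEX = re.compile(r"([a-z]+-[a-zA-Z0-9]+)(_[a-z]+-[a-zA-Z0-9]+)*_data\.csv")


def check_for_datafiles(dataset_root: Path) -> Optional[dict]:
    """Return a MISSING_DATAFILE issue if data/ holds no well-named CSV file.

    Hypothetical helper; only the filename is checked here.
    """
    data_dir = dataset_root / "data"
    # rglob searches every nesting level below data/.
    candidates = data_dir.rglob("*.csv") if data_dir.is_dir() else []
    if any(FILE_REGEX.fullmatch(path.name) for path in candidates):
        return None
    return {"code": "MISSING_DATAFILE", "level": "error"}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    print(check_for_datafiles(root))  # the issue: there is no data/ directory yet

    nested = root / "data" / "raw" / "lab4"
    nested.mkdir(parents=True)
    (nested / "study-123_site-lab4_data.csv").write_text("id,rt\n1,350\n")
    print(check_for_datafiles(root))  # None: one well-named datafile exists
```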