BEP guidelines

Based on the ever-growing set of BEPs and the work that went into developing them, the community has identified a set of general guidelines for the BEP development process.

These guidelines are not part of the BIDS specification, but rather are recommended to be followed when developing a BEP.

These guidelines are RECOMMENDED, not set in stone, and can be modified as needed. The goal is to establish a basis of consensus that eases agile approval of BEPs proposing terms in line with these guidelines.

Generic guidelines

Get the community involved

Try to reach out to colleagues working with the type of data you are trying to add support for. The more people looking at your extension the better it will become through discussions.

Be consistent with the main specification

The main specification follows some general rules. For example, see the rules on participant labels.

Try not to deviate from BIDS conventions in your extension.

Avoid backward incompatible changes

BIDS is already incorporated in many tools. Proposing a change that would render already released BIDS datasets non-compliant may cause confusion and force developers to update their code. Such situations should be avoided when possible.

That said, breaking changes may eventually be necessary. If you have an idea that introduces a backwards-incompatible change, please add it as an issue to the BIDS 2.0 GitHub repository.

Use existing and common practices/formats

It is likely that certain data types are already stored in particular formats within your sub-field. If so, adopting these formats may help with community uptake. However, consistency with the BIDS specification takes priority.

In earlier versions of BIDS, some format choices—such as separate .bvec and .bval files for diffusion MRI—were made to accommodate legacy tools. These examples are not necessarily recommended for new BEPs. In fact, unified and structured formats like TSV or HDF5 would likely have been preferable.

Choosing file formats: downstream- vs upstream-looking

When proposing formats for storing new data types or metadata, BEP authors must consider BIDS' dual mission: not only to accurately represent acquired data, but also to enable scalable, transparent, and efficient reuse.

We distinguish between two broad types of file format orientation:

Downstream-looking formats (RECOMMENDED for both raw and derivative data)

These formats are designed with processing and data reuse in mind. They:

Support random access, chunking, and parallel I/O
Separate data and metadata clearly or use structured containers
Are widely supported in scientific computing and data science ecosystems (e.g., Python, R, Julia)
Facilitate validation, sharing, and cloud-based analysis
Are used across disciplines beyond neuroimaging, increasing accessibility

Under these principles, BIDS currently supports TSV and JSON, and formats such as Parquet are under consideration.

Even for raw BIDS data, such formats should be preferred—when appropriate—to make the generation of BIDS-Derivatives easier and more robust. By adopting these formats early, BIDS enables analysis tools that are modality-agnostic, interoperable, and future-proof.
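As a minimal sketch of what "separate data and metadata clearly" can look like in practice, the snippet below reads a TSV table and a JSON sidecar using only the Python standard library. The file contents, column names, and the `load_table` helper are hypothetical and chosen for illustration only.

```python
import csv
import io
import json

# Hypothetical file contents, inlined so the example is self-contained.
# Column names and values are made up for illustration.
TSV_TEXT = "onset\tduration\ttrial_type\n0.0\t1.5\tgo\n2.0\t1.5\tstop\n"
SIDECAR_TEXT = json.dumps({
    "onset": {"Units": "s"},
    "duration": {"Units": "s"},
    "trial_type": {"Levels": {"go": "Go trial", "stop": "Stop trial"}},
})


def load_table(tsv_text, sidecar_text):
    """Return (rows, column_metadata) from a TSV body and its JSON sidecar."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    metadata = json.loads(sidecar_text)
    return rows, metadata


rows, metadata = load_table(TSV_TEXT, SIDECAR_TEXT)
for row in rows:
    # The data come from the TSV; the units come from the sidecar.
    print(row["onset"], metadata["onset"]["Units"], row["trial_type"])
```

Because both files are plain text in widely supported formats, the same table can be opened from R, Julia, or a spreadsheet without any modality-specific library.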

Upstream-looking formats (USE WITH CAUTION)

These formats are optimized for the device's internal representation of data. They:

Frequently combine data and metadata into less transparent or binary containers
Require specialized, vendor-specific libraries
Reflect device-centric constraints (e.g., sampling order, encoding specifics)

The paradigmatic example of this kind of format is DICOM. While upstream-looking formats are often necessary at the acquisition stage and may serve well for archival purposes, they are not ideal for BIDS representations, especially when open science and cross-discipline reuse are priorities.

If used at all in raw BIDS data, these formats should be accompanied by clear justifications, examples, and mappings to more general-purpose representations.

In summary:

Favor general-purpose, open formats used across scientific disciplines
Avoid redundant format options for the same data type
Do not assume legacy popularity justifies a format's inclusion
Plan for analysis and interoperability from the start, not as an afterthought

By orienting BIDS around downstream-compatible formats, we improve not only developer adoption, but also scientific reproducibility, modular pipeline construction, and accessibility for researchers outside niche modality communities.

Try to link with other existing standards and ontologies

There are many other standardization efforts that may inform your BEP. When possible, adopt terms or definitions from existing standards, or explicitly link to them. A good example is mapping BIDS metadata fields to DICOM tags.
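The mapping between BIDS metadata fields and DICOM tags can be sketched as a small lookup table. The three tags below are standard DICOM attributes (Repetition Time, Echo Time, Flip Angle); the `dicom_to_bids` helper and the dictionary-based header are hypothetical simplifications, not the interface of any real DICOM library.

```python
# BIDS field -> (DICOM tag, factor converting the DICOM value to the BIDS unit).
# DICOM stores Repetition Time and Echo Time in milliseconds; BIDS uses seconds.
BIDS_TO_DICOM = {
    "RepetitionTime": ((0x0018, 0x0080), 0.001),
    "EchoTime": ((0x0018, 0x0081), 0.001),
    "FlipAngle": ((0x0018, 0x1314), 1.0),
}


def dicom_to_bids(header):
    """Build a BIDS-style sidecar dict from a header.

    `header` is assumed to be a plain dict mapping (group, element) tag
    tuples to numeric values; a real DICOM reader exposes a richer object.
    """
    sidecar = {}
    for field, (tag, factor) in BIDS_TO_DICOM.items():
        if tag in header:
            sidecar[field] = round(header[tag] * factor, 6)
    return sidecar


# Hypothetical header values for illustration.
example_header = {
    (0x0018, 0x0080): 2000.0,  # Repetition Time in ms
    (0x0018, 0x0081): 30.0,    # Echo Time in ms
    (0x0018, 0x1314): 90.0,    # Flip Angle in degrees
}
sidecar = dicom_to_bids(example_header)
print(sidecar)
```

Keeping the tag and the unit conversion next to each field name documents the link to the existing standard explicitly, which is the point of this guideline.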

Facilitate atomic changes

See issue #371 for motivation and discussion. It is recommended to isolate small, reusable changes (e.g., new metadata fields or entities) as separate PRs early in BEP development.

This allows review and reuse of terms across BEPs and can help streamline the review process by avoiding large, all-at-once PRs.

Limit flexibility, consider tool developers

Flexibility in design often comes at the expense of tool developer effort and standard interpretability. For example, allowing multiple file formats for the same data type means tool authors must account for each of them—possibly duplicating testing and maintenance work.

When in doubt, choose simplicity and clarity over configurability. BIDS should be predictable, not permissive, when it comes to how data are stored.

Make use of the BIDS Schema

Working with the BIDS Schema will enable validation of your BEP. For more information on translating your BEP into the schema, visit the schema guide on this site.

Be consistent with other BEPs

A common dictionary (BIDS keys) is what makes BIDS successful, so it is essential not to create many new entities. Many of the current BEPs have developed useful terms, which we list here and recommend reusing.

Existing entities

Entity	Description
desc-	Alphanumeric label for any use; it is up to pipelines to determine what is valuable. Common desc acronyms: lp30: low-pass filtered at 30 Hz; hp05: high-pass filtered at 0.5 Hz; reref: re-referenced to another electrode; mc: motion corrected; sm: smoothed; pvc: partial volume corrected; McPvc: motion and partial volume corrected. Note: the above can be concatenated, preferably in the order in which they were applied when applicable, like McPvc or RerefLp30. PascalCase is recommended when concatenating descriptions.
space-	Name of space file is aligned to (standard or non-standard)
res-	Identifier for spatial resolution (details in sidecar)
den-	Identifier for mesh density (details in sidecar)
label-	Label of ROI described by mask file
hemi-{L|R}	File describes left or right hemibrain
seg-	As per the current atlas definition, a label the user MAY use to distinguish different segmentations, like `atlas/atlas-DKT_space-FSaverage.nii` `sub-01/sub-01_space-T1_seg-DKT_dseg.nii` (are there current uses of the `atlas` key that would be broken by changing to `seg`?)
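
To illustrate how the entities in the table above appear in file names, here is a minimal sketch that splits a file name into its key-value entities and builds a concatenated `desc-` label in PascalCase. Both helpers (`parse_entities`, `join_desc`) are hypothetical and deliberately simplified; they do not validate entity order or check names against the BIDS Schema.

```python
def parse_entities(filename):
    """Split a BIDS-style file name into (entities, suffix, extension)."""
    name = filename.rsplit("/", 1)[-1]          # drop any directory part
    stem, _, extension = name.partition(".")    # keep multi-part extensions whole
    *pairs, suffix = stem.split("_")            # last chunk is the suffix
    entities = dict(pair.split("-", 1) for pair in pairs)
    dotted_extension = "." + extension if extension else ""
    return entities, suffix, dotted_extension


def join_desc(labels):
    """Concatenate desc labels in PascalCase, in the order they were applied."""
    return "".join(label.capitalize() for label in labels)


entities, suffix, extension = parse_entities("sub-01/sub-01_space-T1_seg-DKT_dseg.nii")
print(entities, suffix, extension)
print(join_desc(["mc", "pvc"]))      # motion correction, then partial volume correction
print(join_desc(["reref", "lp30"]))  # re-referencing, then low-pass filtering
```

A tool that relies on a small, shared set of entity keys can stay this simple; every new entity adds a key that each such tool has to learn.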

Proposed entities

Entity	BEP(s)	Description
model-	BEP016, BEP039	Name of model generating derivative file
param-	BEP016, BEP039	Name of parameter represented by file
atlas-	BEP003, BEP038	Atlas is defined, as per Merriam-Webster, as a bound collection of maps (i.e. labeled brain regions) and metadata (tables, or textual matter), like `atlas-x_space-MNI305_ext` or `atlas-DKT_ext`
group-	BEP039	Name of group combining over subjects
node-	No BEP (BEP-002 working implementation)	Name of processing node generating derivative file
stat-	BEP016 (contemplated but not currently present in proposal), also useful for atlas BEP038	The idea was that one could encode an aggregate statistic, such as the mean value across all values in a time series or within a DWI shell. The particular aggregate statistic may not be an adequate descriptor on its own; you could also need, e.g., the axis along which the aggregate was applied, or which elements were or were not included in the aggregate. So it might be too much complexity to hand to a single entity?
meas-	BEP017, BEP023	Description of the quantity described by the file when the suffix is insufficient (e.g. binding value, relaxation time)

Derivatives BEP and provenance

The objective of BIDS is to promote data sharing by ensuring that information is easily accessible and reusable. For this purpose, it is highly recommended to provide comprehensive provenance information to ensure transparency and traceability. While full provenance enables full reproducibility, it is not a prerequisite. A suggested approach to developing the BEP is to envision each processing step, including potential file names and JSON structures. While this exercise might not precisely depict the eventual output files and JSON contents, it is instrumental in capturing provenance and in identifying which files or information should be retained for optimal reuse in future studies.
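
The exercise of envisioning each processing step could be sketched as follows. The two-step pipeline, the file names, and the sidecar keys are all hypothetical and only illustrate the idea; they are not a prescribed provenance format.

```python
import json

# Hypothetical two-step pipeline; file names are illustrative only.
PLANNED_STEPS = [
    {
        "step": "motion correction",
        "input": "sub-01_task-rest_bold.nii.gz",
        "output": "sub-01_task-rest_desc-mc_bold.nii.gz",
    },
    {
        "step": "smoothing",
        "input": "sub-01_task-rest_desc-mc_bold.nii.gz",
        "output": "sub-01_task-rest_desc-McSm_bold.nii.gz",
    },
]


def plan_sidecars(steps):
    """Draft one JSON sidecar per planned output (keys are assumptions)."""
    sidecars = {}
    for step in steps:
        sidecar_name = step["output"].split(".", 1)[0] + ".json"
        sidecars[sidecar_name] = {
            "Description": step["step"],
            "Sources": [step["input"]],
        }
    return sidecars


def trace(output, steps):
    """Walk back from an output file to the file the pipeline started from."""
    by_output = {step["output"]: step for step in steps}
    chain = [output]
    while chain[-1] in by_output:
        chain.append(by_output[chain[-1]]["input"])
    return chain


sidecars = plan_sidecars(PLANNED_STEPS)
print(json.dumps(sidecars, indent=2))
print(trace("sub-01_task-rest_desc-McSm_bold.nii.gz", PLANNED_STEPS))
```

If `trace` cannot reach the raw file from a planned output, some intermediate file or piece of metadata is missing from the plan, which is the kind of gap this exercise is meant to reveal.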

Note

See the FAQ on provenance

Common pitfalls

Relying on merging the extension on a set timeline

We have found that it is very difficult to predict how long a BEP will take to merge into the standard. One challenge that has occurred in the past is a doctoral student requiring acceptance of their work as a requirement for graduation. We do not recommend tying contributions to the BIDS community (or any volunteer-led open source community) to strict timelines, because of the uncertainty around domain-specific community engagement, feedback from other BIDS contributors, and responding to reviews.

Not considering domain- or field-specific guidelines

In many neuroscience fields there have been past developments and efforts to implement standards, either formally or informally. If possible, BEPs should embrace these rather than trying to come up with alternative standards. BEP authors should therefore take inventory of and review past and existing work that may be relevant to the BEP.

Not considering DICOM fields

Many of the modalities we use have an associated standard, such as DICOM. While BIDS is not specifically about data formats, a lot of metadata is stored in data files, and there is rarely a good reason to use a different name than the one used in an established standard. When working with DICOM data, it is reasonable to check what DICOM has already defined and see whether there is overlap. Similarly, when relevant, we recommend including a sourcedata/ directory with DICOM files in example datasets. You can delete the data and keep the header, removing any personally identifying information (PII), also known as "Personal Data" under the General Data Protection Regulation (GDPR).

Not building up a user community to support the BEP

BEPs are merged only after a community review. It is therefore helpful to get the stakeholders on board early (that is, while writing the BEP) rather than at the review stage. Diversity in the team contributes to the quality of the extension proposal. We recommend that the core team has representatives from 3 different labs, preferably also with a mix of more junior and more senior contributors. You may also consider requesting explicit support letters from external labs.

Specific guidelines

If multiple BEPs need coordination, the section below can be used to formulate guidelines on specific aspects that those BEPs should follow.

Guidelines for spatial derivatives

Within this section, guidelines for developing BEPs that include spatial derivatives are outlined and motivated.

Problem statement

During the work on multiple BEPs that include spatial derivatives, a repeated pattern in generating derivatives within several imaging modalities' workflows was identified where:

A reference map that is used to encode spatial features and parameters is required. There is an antecedent of this in BIDS with BEP23 (see below). In that BEP, the proposed naming takes the pattern _<suffix>ref (for example _boldref, _dwiref...), and that solution has been suggested as a possibility in issue #1532 of the specification repository.
We have derived data that are no longer of the same type as the original, but for which we would like to keep the notion of the modality from which they were derived while also signaling that they are derived (that is, non-raw).

Motivation for guidelines

Many users are not equipped to understand fine distinctions between different classes of derivatives (for example, those produced by a model fit versus those produced by a direct computation).

Guidelines

A specific suffix pattern is used: _<suffix>map, where <suffix> is a BIDS suffix used in the raw data (for example, dwi or bold). For example, the proposed pattern produces the suffixes _dwimap or _boldmap. BEPs may use this suffix pattern under the conditions specified below and MUST specify the extension and metadata that are required with the suffix.

The file descriptor does fall under one of the generic derivatives descriptors.
No other descriptor exists in the BIDS specification. For example, statmap cannot be used, because it is already being used, or soon to be, for a different specification.

This suffix pattern provides context through the concatenation of a raw data suffix and the word "map" which implies that the file still contains spatially contiguous information (in contrast to tabular/"tidy" data, with each row representing a brain region, for example).
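
A minimal sketch of how a tool might derive such a suffix is shown below. The `map_suffix` helper and the `RESERVED_SUFFIXES` set are hypothetical; the set is a tiny stand-in for the suffixes already defined or claimed elsewhere, and a real check would consult the BIDS Schema.

```python
# Assumption: a tiny stand-in for suffixes that are already taken.
RESERVED_SUFFIXES = {"statmap"}


def map_suffix(raw_suffix):
    """Return the '<suffix>map' derivative suffix for a raw-data suffix."""
    if not raw_suffix.isalnum():
        raise ValueError(f"suffix must be alphanumeric, got {raw_suffix!r}")
    candidate = f"{raw_suffix}map"
    if candidate in RESERVED_SUFFIXES:
        raise ValueError(f"{candidate!r} is already used for a different purpose")
    return candidate


for raw in ("dwi", "bold"):
    print(raw, "->", map_suffix(raw))
```

Rejecting reserved names up front mirrors the condition that the pattern applies only when no other descriptor already exists.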

This pattern is, in principle, generalizable across BEPs and derivatives in general:

A data process might have generated primary parameters that are either 3D (x,y,z) or 4D (x,y,z,v). These parameters might be of help for further data analysis or data interpretation, and ultimately the data end user. Examples include "statistics" such as mean, std, and so on, or model derivatives, such as DTI FA.
At the same time, the process might have generated secondary parameters. These are not strictly necessary for further processing or data interpretation, but they can be potentially useful to interpret the outputs of the data process, to track history of the processing, for reproducibility and ultimately for debugging purposes of the developer/modeler of the code.

<source_entities>_stat-<mean|std|...>_boldmap.nii.gz
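
The naming pattern above could be assembled programmatically as in the following sketch. The `stat_map_filename` helper and the example entity values are hypothetical, and `stat-` is still a proposed entity rather than an accepted one.

```python
def stat_map_filename(source_entities, stat, raw_suffix="bold", extension=".nii.gz"):
    """Assemble <source_entities>_stat-<stat>_<raw_suffix>map<extension>.

    `source_entities` is an ordered mapping of entity key to label,
    for example {"sub": "01", "task": "rest"}.
    """
    if not stat.isalnum():
        raise ValueError(f"stat label must be alphanumeric, got {stat!r}")
    prefix = "_".join(f"{key}-{label}" for key, label in source_entities.items())
    return f"{prefix}_stat-{stat}_{raw_suffix}map{extension}"


for stat in ("mean", "std"):
    print(stat_map_filename({"sub": "01", "task": "rest"}, stat))
```

Keeping the source entities unchanged in the output name preserves the link between each map and the raw file it was computed from.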