PubChem BioAssay Description CSV Tags
The PubChem Deposition Gateway allows BioAssay depositors to upload assay description
information as CSV (spreadsheet) files. Note that these files are separate from the
CSV file used to upload BioAssay data. Because the description is uploaded as CSV,
an upload can be prepared directly inside popular spreadsheet programs without
converting to other file formats.
There is, however, a requirement that the description information be organized
in specific ways so that our system can recognize and better validate it. The
principal way this is done is through standard header tags that must be used at
the top of each column.
The following is a listing of the accepted CSV file formats,
including header requirements and appropriate value tags.
§Header tag must be in the first column and is mandatory.
*Header tag is mandatory.
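The two markers above can be checked mechanically before a file is uploaded. The following is a minimal sketch only, not the validation performed by the Deposition Gateway: the function name `check_headers` and its arguments are hypothetical, and it assumes the header tags sit in the first row of the file.

```python
import csv
import io


def check_headers(csv_text, first_column_tag, mandatory_tags=()):
    """Return a list of problems found in the header row of csv_text.

    first_column_tag: the tag marked with the section sign (must be column 1).
    mandatory_tags:   tags marked with an asterisk (may be in any column).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows or not rows[0]:
        return ["file is empty"]
    header = [cell.strip() for cell in rows[0]]
    problems = []
    if header[0] != first_column_tag:
        problems.append(f"first column must be {first_column_tag}")
    for tag in mandatory_tags:
        if tag not in header:
            problems.append(f"missing mandatory header {tag}")
    return problems
```

For example, `check_headers(text, "TARGET_TYPE", ["TARGET_ID"])` returns an empty list when a Target Definitions file has TARGET_TYPE in the first column and a TARGET_ID column somewhere in the header row.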
1. General Description Items

This CSV file defines individual items in the description such as the Name
and External RegId of the assay.
Headers:
Accepted DESCR_TAGs:
- PUBCHEM_EXT_DATASOURCE_REGID
- PUBCHEM_ASSAY_NAME
- PUBCHEM_ASSAY_DESCRIPTION
- PUBCHEM_ASSAY_PROTOCOL
- PUBCHEM_ASSAY_COMMENTS
- PUBCHEM_GRANT_NUMBER
- PUBCHEM_HOLD_UNTIL_DATE
  - Accepted format of values for this tag: YYYY/MM/DD, like 2000/12/31. An alternative form
    of YYYY-MM-DD is also accepted.
- PUBCHEM_PROJECT_CATEGORY
  - Accepted values for this tag: NIH_MLSCN, NIH_MLPCN, NIH_MLSCN_ASSAY_PROVIDER,
    NIH_MLPCN_ASSAY_PROVIDER, JOURNAL_ARTICLE, ASSAY_VENDOR,
    LITERATURE_EXTRACTED, LITERATURE_AUTHOR, LITERATURE_PUBLISHER,
    RNAI_GLOBAL_INITIATIVE, OTHER
- PUBCHEM_ACTIVITY_OUTCOME_METHOD
  - Accepted values for this tag: OTHER,
    PRIMARY_SCREENING, CONFIRMATORY, SUMMARY
- PUBCHEM_SUBSTANCE_TYPE
  - Accepted values for this tag:
    SMALL_MOLECULE, NUCLEOTIDE
- PUBCHEM_ASSAY_PANEL_NAME
- PUBCHEM_ASSAY_PANEL_DESCRIPTION
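The value restrictions listed above can be checked locally before upload. This is a minimal sketch under stated assumptions: the helper names `is_valid_hold_date` and `is_valid_descr_value` are hypothetical, and tags with no listed restriction are treated as free text.

```python
from datetime import datetime

# Accepted values copied from the tag list above.
ACCEPTED_VALUES = {
    "PUBCHEM_PROJECT_CATEGORY": {
        "NIH_MLSCN", "NIH_MLPCN", "NIH_MLSCN_ASSAY_PROVIDER",
        "NIH_MLPCN_ASSAY_PROVIDER", "JOURNAL_ARTICLE", "ASSAY_VENDOR",
        "LITERATURE_EXTRACTED", "LITERATURE_AUTHOR", "LITERATURE_PUBLISHER",
        "RNAI_GLOBAL_INITIATIVE", "OTHER",
    },
    "PUBCHEM_ACTIVITY_OUTCOME_METHOD": {
        "OTHER", "PRIMARY_SCREENING", "CONFIRMATORY", "SUMMARY",
    },
    "PUBCHEM_SUBSTANCE_TYPE": {"SMALL_MOLECULE", "NUCLEOTIDE"},
}


def is_valid_hold_date(value):
    """True if value parses as YYYY/MM/DD or YYYY-MM-DD.

    Note: strptime also tolerates months and days without a leading zero.
    """
    for fmt in ("%Y/%m/%d", "%Y-%m-%d"):
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def is_valid_descr_value(tag, value):
    """Check value against the restriction for tag, if one is listed."""
    if tag == "PUBCHEM_HOLD_UNTIL_DATE":
        return is_valid_hold_date(value)
    allowed = ACCEPTED_VALUES.get(tag)
    # Assumption: tags without a listed restriction accept any text.
    return allowed is None or value in allowed
```

Using `strptime` rather than a regular expression also rejects impossible dates such as 2000/13/01.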
2. Result Definitions (TIDs)
This CSV file defines the result definitions (TIDs) for which you will be providing data
(in a later step of the submission process). The format of this file has changed to make it
more convenient, but the previous format is
still accepted.
Column Headers:
- PUBCHEM_RESULT_TAG§
- Submitter-defined Result Names (1 per column)*
Row Headers (optional, starting from Row 2):
- RESULT_TYPE*
  - Accepted values for this tag: FLOAT, INTEGER, BOOLEAN,
    STRING, PUBCHEM_NCBI_PUBMED_ID, PUBCHEM_NCBI_MMDB_ID, PUBCHEM_EXT_URL,
    PUBCHEM_NCBI_PROTEIN_GI, PUBCHEM_NCBI_NUCLEOTIDE_GI,
    PUBCHEM_NCBI_TAXONOMY_ID, PUBCHEM_NCBI_OMIM_ID, PUBCHEM_NCBI_GENE_ID,
    PUBCHEM_NCBI_PROBE_ID, PUBCHEM_AID, PUBCHEM_SID, PUBCHEM_CID,
    TARGET_NCBI_PROTEIN_GI, TARGET_NCBI_BIOSYSTEMS_ID, TARGET_NAME,
    TARGET_DESCRIPTION, TARGET_NCBI_TAXONOMY_ID, TARGET_NCBI_GENE_ID,
    TARGET_NCBI_DNA_GI, TARGET_NCBI_RNA_GI
- RESULT_DESCR
- RESULT_UNIT
  - Accepted values for this tag: PPT, PPM, PPB,
    MILLIMOLAR, MICROMOLAR, NANOMOLAR, PICOMOLAR, FEMTOMOLAR,
    MILLIGR_PER_ML, MICROGR_PER_ML, NANOGR_PER_ML, PICOGR_PER_ML, FEMTOGR_PER_ML,
    MOLAR, PERCENT, RATIO, SECONDS, RECIPROCAL_SECONDS, MINUTES, RECIPROCAL_MINUTES,
    DAYS, RECIPROCAL_DAYS, ML_MIN_KG, L_KG, HR_NG_ML, CM_SEC, MG_KG,
    OTHER, NONE, UNSPECIFIED
- RESULT_ATTR_CONC_MICROMOL
- RESULT_CONC_RESPONSE_SERIES_ID
- RESULT_IS_ACTIVE_CONCENTRATION
  - Use 1 for true or leave blank otherwise.
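The column and row headers above can be assembled into a file as sketched below. This is an illustration only and reflects one reading of the layout: PUBCHEM_RESULT_TAG heads the first column, each further column is one submitter-defined result, and each row from Row 2 onward starts with a RESULT_* tag. The function name `build_tid_csv` and the sample result names are hypothetical.

```python
import csv
import io

# Row tags written by this sketch; the other row headers could be added the same way.
ROW_TAGS = ["RESULT_TYPE", "RESULT_DESCR", "RESULT_UNIT"]


def build_tid_csv(results):
    """Build result-definition CSV text.

    results: list of dicts, each with a 'name' key (the column header)
             and optional keys named after the row tags.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Row 1: PUBCHEM_RESULT_TAG in the first column, then one result name per column.
    writer.writerow(["PUBCHEM_RESULT_TAG"] + [r["name"] for r in results])
    # Rows 2+: one row per tag; cells are left blank where a result has no value.
    for tag in ROW_TAGS:
        writer.writerow([tag] + [r.get(tag, "") for r in results])
    return out.getvalue()


example = build_tid_csv([
    {"name": "IC50", "RESULT_TYPE": "FLOAT",
     "RESULT_DESCR": "Half-maximal inhibitory concentration",
     "RESULT_UNIT": "MICROMOLAR"},
    {"name": "Curve class", "RESULT_TYPE": "STRING"},
])
print(example)
```

The printed text can be saved as a .csv file or pasted into a spreadsheet to confirm that each result occupies its own column.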
3. Target Definitions
This CSV file defines one or more targets of the BioAssay.
Headers:
- TARGET_TYPE§
- TARGET_ID*
- TARGET_DESCRIPTION
- TARGET_COMMENT
- TARGET_NAME
Accepted TARGET_TYPEs:
- PUBCHEM_NCBI_PROTEIN_GI
- PUBCHEM_NCBI_DNA_GI
- PUBCHEM_NCBI_RNA_GI
- PUBCHEM_NCBI_GENE_ID
- PUBCHEM_NCBI_BIOSYSTEMS_ID
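A Target Definitions file can be screened for unrecognized TARGET_TYPE values before upload. The sketch below is not the system's own validation: `find_target_problems` is a hypothetical name, and treating an empty TARGET_ID cell as an error is an assumption (the list above only states that the TARGET_ID header is mandatory).

```python
import csv
import io

# Accepted TARGET_TYPE values copied from the list above.
ACCEPTED_TARGET_TYPES = {
    "PUBCHEM_NCBI_PROTEIN_GI",
    "PUBCHEM_NCBI_DNA_GI",
    "PUBCHEM_NCBI_RNA_GI",
    "PUBCHEM_NCBI_GENE_ID",
    "PUBCHEM_NCBI_BIOSYSTEMS_ID",
}


def find_target_problems(csv_text):
    """Return a list of human-readable problems, one per offending cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    # Data rows start on line 2 of the file; line 1 is the header row.
    for line_no, row in enumerate(reader, start=2):
        target_type = (row.get("TARGET_TYPE") or "").strip()
        if target_type not in ACCEPTED_TARGET_TYPES:
            problems.append(f"row {line_no}: unknown TARGET_TYPE {target_type!r}")
        # Assumption: every target row needs a non-empty TARGET_ID.
        if not (row.get("TARGET_ID") or "").strip():
            problems.append(f"row {line_no}: TARGET_ID is empty")
    return problems
```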
4. XRefs (Cross References to Other Databases)
This CSV file provides XRefs, cross references to relevant information in
other databases. One example would be PubChem AIDs of other related BioAssays.
Headers:
- XREF_TYPE§
- XREF_VALUE*
- XREF_ANNOTATION
- XREF_TYPE_QUALIFIER (Accepted Value: PRIMARY_CITATION)
  Use this only with the PubMed ID type (PUBCHEM_NCBI_PUBMED_ID) to indicate that the
  reference is a primary citation directly relevant to the assay.
Accepted XREF_TYPEs:
- PUBCHEM_AID
- PUBCHEM_EXT_DATASOURCE_REGID
- PUBCHEM_NCBI_PUBMED_ID
- PUBCHEM_NCBI_MESH_TERM
- PUBCHEM_GENBANK_GENERIC_ID
- PUBCHEM_NCBI_MMDB_ID
- PUBCHEM_SID
- PUBCHEM_CID
- PUBCHEM_EXT_DATASOURCE_URL
- PUBCHEM_EXT_SUBSTANCE_URL
- PUBCHEM_EXT_ASSAY_URL
- PUBCHEM_NCBI_PROTEIN_GI
- PUBCHEM_NCBI_NUCLEOTIDE_GI
- PUBCHEM_NCBI_TAXONOMY_ID
- PUBCHEM_NCBI_OMIM_ID
- PUBCHEM_NCBI_GENE_ID
- PUBCHEM_NCBI_BIOSYSTEMS_ID
- PUBCHEM_REGNUM
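The rule for XREF_TYPE_QUALIFIER can be expressed as a small check. This is a minimal sketch with a hypothetical function name, `xref_qualifier_ok`; it assumes a blank qualifier cell is always acceptable, since the header is not marked as mandatory.

```python
def xref_qualifier_ok(xref_type, qualifier):
    """True if the qualifier is blank, or is PRIMARY_CITATION on a PubMed ID XRef."""
    qualifier = (qualifier or "").strip()
    if not qualifier:
        # Assumption: the qualifier is optional and may be left blank.
        return True
    return (
        qualifier == "PRIMARY_CITATION"
        and xref_type == "PUBCHEM_NCBI_PUBMED_ID"
    )
```

Under this check, PRIMARY_CITATION on a PUBCHEM_AID row is flagged, while the same qualifier on a PUBCHEM_NCBI_PUBMED_ID row passes.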
5. Categorized Comments
Currently, Categorized Comments are added or modified via the Load from CSV link
on the Assay Description form or via the Upload Description File page.
The Categorized Comments CSV file lets you define your own TAG-VALUE pairs, which
are stored in the assay record as comments. All such comments will be searchable
in PubChem and provide a convenient placeholder for user-defined ontologies or
other definitions outside the scope of the PubChem specification.
Headers:
- CAT_COMMENT_TAG§
- CAT_COMMENT_VALUE*
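User-defined TAG-VALUE pairs can be written out with a standard CSV library, which takes care of quoting values that contain commas. The following is a sketch only: the function name `build_categorized_comments_csv` and the sample tags and values are hypothetical.

```python
import csv
import io


def build_categorized_comments_csv(pairs):
    """Build CSV text from an iterable of (tag, value) pairs."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # CAT_COMMENT_TAG must be the first column; CAT_COMMENT_VALUE is mandatory.
    writer.writerow(["CAT_COMMENT_TAG", "CAT_COMMENT_VALUE"])
    for tag, value in pairs:
        writer.writerow([tag, value])
    return out.getvalue()


comments = build_categorized_comments_csv([
    ("Assay format", "Cell-based"),
    ("Detection method", "Luminescence, endpoint"),
])
print(comments)
```

The second value contains a comma, so the writer wraps it in double quotes and it still loads as a single cell in a spreadsheet program.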
6. Panel Assay Information
This CSV file defines the Panel Member information for a Panel Assay. Note that this
CSV file is currently input via the separate, dedicated Panel Components Upload dialog.
Headers:
- PANEL_ID§
- PANEL_NAME
- PANEL_DESCRIPTION
- PANEL_PROTOCOL
- PANEL_COMMENT
- PANEL_TARGET_NAME
- PANEL_TARGET_ID
- PANEL_TARGET_TYPE
- PANEL_ACT_OUTCOME_METHOD
- PANEL_TID_NAMES_REGULAR
- PANEL_TAXONOMY
- PANEL_GENE
- PANEL_TID_NAMES_OUTCOME
- PANEL_TID_NAMES_SCORE
- PANEL_TID_NAMES_AC