PubChem Assay Tags


The external assay identifier assigned by you, the assay owner. This must be unique amongst all of your PubChem assays.


A short, informative name of the assay for display purposes.


A definition of the assay purpose and parameters.


The protocol used to generate the assay. This might include an explanation of how the Activity Outcome and Score values in the Assay Data were determined.


Additional information that might not fit in the Description or Protocol sections.


These are Tag-Value pairs that provide a convenient placeholder for submitter-defined ontologies or other definitions outside the scope of the PubChem specification. All such comments will be searchable in PubChem. The tutorial includes an example of Tag-Value pairs.


For any assay designed to identify chemicals that interact with a target (for example, an enzyme inhibitor assay), please specify the sequence identifier here. For a chemical assay, the target is typically a protein, but it can also be a gene, nucleotide, or pathway ID. Cell-based assays can skip this field.

Note that only one or a few targets should be identified here. If you have something like an RNAi assay, target definitions that change for each tested result should be specified in a column within the Assay Data.


Cross-references (XRefs) can be made to many NCBI database records related to your assay. This includes other PubChem BioAssays either by AID number or RegId, but it also includes PubMed Ids, Taxonomy Ids, and many other databases.

Note: please do not duplicate the protein identifier here if it is already used in the Target section.


For most assay submissions, the assay data contains the actual data reported for each tested substance. The submitter may define as many columns as desired reporting numerical values, such as IC50s and Percent Inhibition, but also labels and database identifiers.

One column of activity outcome values must be reported to give a submitter-defined judgment call on whether each row should be considered inactive (1) or active (2).

Please consult our help documentation for data specification and sample files.


PubChem expands the bioassay data model to support the presentation and annotation of profiling screening results.

Panel assays are very complex in nature and we have tried to make the interface as user-friendly as possible. Please remember, however, that extra attention should be paid to panel assay definitions and data to ensure their accuracy.

To see a panel assay example, please take a look at this kinase profiling assay.


Please indicate the type of substances tested in your assay to help categorize assays into chemical and RNAi types, for example.


Classify your assay by how the activity outcome was defined. Choices include:

Screening assay - Single concentration activity observed.

Confirmatory assay - Concentration-response relationship observed (EC50, IC50, etc.).

Summary assay - Overview of and links to multiple, related assays.

Other - Assay does not fall into the above categories.


Classify your assay if a specialized project was used for its creation. If none of these apply to you, please choose 'Other'.

Literature, Extracted - Select if assay data extracted from literature by 3rd party (not by author or article publisher).

Literature, Author / Publisher - Select if assay data extracted from article by author or by publisher.

RNAi Global Initiative - Select if work is from a member of the RNAi Global Initiative.

Assay Vendor - Select if contributed by an assay service provider.

NIH Molecular Libraries - Select if an assay experiment was funded by the relevant NIH Molecular Libraries program.


A grant number can be specified. Note that this string is not validated.


A label to be added to multiple assays for the purpose of logically grouping them.


Optional hold-until date to delay public access of assay data in PubChem. This may be useful, for example, to coordinate release of data with a journal publication.

Note that until that date, access to the data will be restricted to your PubChem Upload account.

For more information, please consult our help documentation about data release.


If you have previously deposited your Substance description into PubChem, you may use your Substance identifier (SID) assigned by PubChem. This must be an unsigned integer value and, in nearly all cases, your organization must have deposited the Substance associated with this SID.


You may use your own identifier for Substance descriptions previously loaded into PubChem.


This field allows the submitter to make an expert judgment call about the activity of each test result. The value is a number, set to 1 (inactive) or 2 (active) by whatever means are appropriate. An explanation of that determination should be provided in the Protocol or Comments section of the Assay Description.

In addition to active/inactive, this field can also be set to 3 (inconclusive), 4 (unspecified) or 5 (probe). The 'probe' designation indicates that the activity of the test result has been tested and confirmed through multiple rounds of experimental inquiry.
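
As an illustration only, the five outcome codes listed above could be represented in a small Python enumeration. The class and function names here are hypothetical and are not part of any PubChem tool; only the numeric codes come from the text.

```python
from enum import IntEnum


class ActivityOutcome(IntEnum):
    """Activity outcome codes as listed in the description above."""
    INACTIVE = 1
    ACTIVE = 2
    INCONCLUSIVE = 3
    UNSPECIFIED = 4
    PROBE = 5


def parse_outcome(raw: str) -> ActivityOutcome:
    """Convert a cell value such as '2' into an ActivityOutcome.

    Raises ValueError for blanks, non-integers, or codes outside 1-5.
    """
    return ActivityOutcome(int(raw.strip()))
```

A submitter-side script might call `parse_outcome` on every outcome cell before upload to catch stray values early.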


The activity of a test result may be assigned a normalized score between 0 and 100, where the most active result rows have scores closer to 100 and inactive rows have scores closer to 0, so that one can rank the results by this score and prioritize hits.
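
How a score is derived is left to the submitter. The sketch below shows one possible approach, assuming the underlying measurement is a percent inhibition and that a whole-number score is wanted. Both are assumptions made for this example, and the function name is hypothetical.

```python
def score_from_percent_inhibition(percent_inhibition: float) -> int:
    """Map a percent-inhibition value onto a 0-100 score.

    Values below 0 or above 100 (possible with noisy readouts) are
    clamped to the ends of the range before rounding.
    """
    clamped = max(0.0, min(100.0, percent_inhibition))
    return int(round(clamped))


# Ranking rows by score, most active first (row labels are made up).
rows = {"SID_A": 87.4, "SID_B": -12.0, "SID_C": 130.0}
ranked = sorted(rows, key=lambda sid: score_from_percent_inhibition(rows[sid]),
                reverse=True)
```

Other normalizations (for example, scaling a potency value) are equally valid, provided the Protocol or Comments section explains the method.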


A URL may optionally be provided for Assay Data reported for this Substance in this column. This URL will be provided within PubChem displays to allow a PubChem user to link to your website, where you may choose to provide additional information or interfaces to your Assay Data, for example, dose-response curves, replicate data, etc.


Your textual annotation and comments may optionally be provided for Assay Data reported for this Substance in this column.


When you submit data, you must leave this column blank or enter a value of "0". You may optionally suppress Assay Data for this Substance by entering a value of "1" in this column. In this case, leave all other columns blank except for Column 1: PUBCHEM_SID. Suppressing Assay Data does not delete data from PubChem; rather, it eliminates all references and links to this information. However, all pre-existing links to this information will still function, and a disclaimer will be displayed specifying that this data is revoked.

You may un-revoke Assay Data for a Substance by depositing either the same or new data for this Substance. Do not revoke and submit the same substance in the same file.
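
The revoke rules above can be checked before upload. The following is a minimal sketch, assuming each data row has been read into a dictionary keyed by column header. `PUBCHEM_SID` is named in the text, but `REVOKE` is a hypothetical stand-in for the revoke column's header, and the function is not part of any PubChem tool.

```python
from typing import Dict, List

SID_COLUMN = "PUBCHEM_SID"   # named in the text above
REVOKE_COLUMN = "REVOKE"     # hypothetical header used only in this sketch


def check_revoke_rows(rows: List[Dict[str, str]]) -> List[str]:
    """Return a list of problems found with revoke flags (empty if none)."""
    errors: List[str] = []
    revoked_sids = set()
    submitted_sids = set()
    for index, row in enumerate(rows, start=1):
        sid = row.get(SID_COLUMN, "").strip()
        flag = row.get(REVOKE_COLUMN, "").strip()
        if flag not in ("", "0", "1"):
            errors.append(f"row {index}: revoke flag must be blank, 0, or 1")
            continue
        if flag == "1":
            revoked_sids.add(sid)
            # A revoked row may only carry the SID and the flag itself.
            extras = [
                name for name, value in row.items()
                if name not in (SID_COLUMN, REVOKE_COLUMN) and value.strip()
            ]
            if extras:
                errors.append(f"row {index}: revoked row has data in {extras}")
        else:
            submitted_sids.add(sid)
    # The same substance must not be revoked and submitted in one file.
    for sid in sorted(revoked_sids & submitted_sids):
        errors.append(f"SID {sid}: revoked and submitted in the same file")
    return errors
```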


Define your own result definition here, one per column. You must give it a name, and you can also specify parameters such as the data type and unit. For example, if you want to report an EC50, you can name it "EC50", set the data type to "FLOAT" and the unit to "MICROMOLAR".
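
To make the EC50 example concrete, here is a sketch of how a script might model a result definition on the submitter's side. The class, its field names, and the `parse` helper are hypothetical. Only the values "EC50", "FLOAT" and "MICROMOLAR" come from the text, and the spelling "INTEGER" is assumed by analogy with "FLOAT".

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ResultDefinition:
    """One submitter-defined result column."""
    name: str
    data_type: str = "STRING"
    unit: Optional[str] = None

    def parse(self, raw: str) -> Union[float, int, str]:
        """Convert a raw cell into a Python value based on data_type.

        Only FLOAT and INTEGER are converted; every other type is
        returned as stripped text.
        """
        text = raw.strip()
        kind = self.data_type.upper()
        if kind == "FLOAT":
            return float(text)
        if kind == "INTEGER":
            return int(text)
        return text


ec50 = ResultDefinition(name="EC50", data_type="FLOAT", unit="MICROMOLAR")
```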

General Description Items


A table column header for general description tags.


A table column header for general description values.

Result Definitions Items


This header goes in the first row, first column of the spreadsheet. Immediately under it are optional tags to define properties of result definitions, such as RESULT_UNIT. In all data rows below that, this column contains an increasing number starting from one.


The result type typically is either a Float, Integer, Boolean or String.

Optionally, the type can be used to specify an identifier, such as one coming from another NCBI Entrez database. For example, if PubMed Id is chosen as the type, then all data values in this column will be checked to ensure that they are valid PubMed identifiers.


Various units are available to better define the measurement of a given result column.


An optional description to explain what is being measured for a given result column.


An optional micromolar concentration at which this result was tested. This attribute implies that the result is biological concentration-response data.


For confirmatory assays, an optional id starting from 1 to group columns into series for defining dose-response curves. If one series is defined, all columns in that series will have a '1' in this field. A second series would use a '2' and so forth.
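
The series id and tested concentration work together: columns sharing a series id form one dose-response curve, and each column's concentration places it along that curve. A rough sketch of that grouping follows. The `ResultColumn` type and the column names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class ResultColumn(NamedTuple):
    name: str
    tested_conc_um: Optional[float]  # micromolar, if given
    series_id: Optional[int]         # 1, 2, ... or None if not in a series


def group_series(columns: List[ResultColumn]) -> Dict[int, List[ResultColumn]]:
    """Group columns by series id, ordered by tested concentration."""
    series: Dict[int, List[ResultColumn]] = defaultdict(list)
    for column in columns:
        if column.series_id is None:
            continue  # column is not part of any dose-response series
        if column.series_id < 1:
            raise ValueError(f"{column.name}: series id must start from 1")
        series[column.series_id].append(column)
    return {
        sid: sorted(cols, key=lambda c: c.tested_conc_um or 0.0)
        for sid, cols in sorted(series.items())
    }
```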


For confirmatory assays, this column allows an optional "1" for the one result column that summarizes the active concentration. This is typically reported as an IC50, EC50, AC50, GI50, etc., or by reporting constant parameters such as Ki.

Categorized Comments Items


A submitter-defined tag to define a categorized comment. This tag column must appear as the first column of the spreadsheet.


The value of a submitter-defined categorized comment.

Target Items


The database type of target identifier supplied.


The required, first table column header for target data. The values in this column are the actual primary identifiers (numbers) from one of the accepted NCBI databases.


The optional name of the target. If left blank, a standard name from the sequence database will be used where possible.


Any additional description of the target beyond its name.


Any additional comments or annotations for the target.

XREF Items


The database type of XRef identifier supplied. This type column must appear as the first column of the spreadsheet.


The actual identifier value from the cross-referenced NCBI database.


Explanatory text describing the relevance of this cross-referenced item to the assay.



Integer (from 1) that is the same for one or more result description columns, thereby grouping them together. Alternatively, a panel type can also be added after the number, like 1_REGULAR, 1_OUTCOME, 1_SCORE or 1_AC. By default, a plain integer is interpreted as the regular type.
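
The panel id format described above (a plain integer, or an integer followed by a type suffix) can be parsed as follows. This is a sketch. The function name is hypothetical, and matching the suffix exactly as written, in upper case, is an assumption.

```python
from typing import Tuple

PANEL_TYPES = ("REGULAR", "OUTCOME", "SCORE", "AC")


def parse_panel_id(value: str) -> Tuple[int, str]:
    """Split a panel id such as '1' or '1_AC' into (number, type).

    A plain integer is treated as the REGULAR type.
    """
    number, separator, suffix = value.strip().partition("_")
    panel_number = int(number)
    if panel_number < 1:
        raise ValueError(f"panel number must start from 1: {value!r}")
    panel_type = suffix if separator else "REGULAR"
    if panel_type not in PANEL_TYPES:
        raise ValueError(f"unknown panel type in {value!r}")
    return panel_number, panel_type
```

For example, `parse_panel_id("1")` and `parse_panel_id("1_REGULAR")` both yield `(1, "REGULAR")`, so columns tagged either way fall into the same panel.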


Short name of panel component.


Short description of the specifics of the panel component, such as cell line or target information.


Specific procedure used to generate results for the panel.


Additional information.


Not necessary to provide - this will be filled in automatically unless you provide a value.


This is mandatory if any of the target fields are present.


This is mandatory if any of the target fields are present. It is an integer: Protein(1), DNA(2), RNA(3), Gene(4), BioSystems(5).


NCBI Taxonomy-id (integer).


NCBI Gene-id (integer).


Assay outcome qualifier (integer). Choices include screening (1), confirmatory (2), summary (3) and other (0).
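
Two of the panel fields above take integer codes: the target type and the assay outcome qualifier. The sketch below shows one way to decode them in a checking script. The dictionary and function names are hypothetical, and only the code-to-label pairs come from the text.

```python
from typing import Dict, Union

TARGET_TYPES: Dict[int, str] = {
    1: "Protein", 2: "DNA", 3: "RNA", 4: "Gene", 5: "BioSystems",
}
OUTCOME_QUALIFIERS: Dict[int, str] = {
    0: "other", 1: "screening", 2: "confirmatory", 3: "summary",
}


def decode(code_table: Dict[int, str], raw: Union[int, str]) -> str:
    """Look up the label for an integer code, rejecting unknown values."""
    code = int(str(raw).strip())
    if code not in code_table:
        raise ValueError(
            f"unknown code {code}; expected one of {sorted(code_table)}"
        )
    return code_table[code]
```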