Help & Documentation

Retrieve statistical results

The statistical-result REST API provides access to the detailed output generated as part of the statistical analysis process.

The API may be queried using:

  • phenotyping center (UCD, Wellcome Trust Sanger Institute, JAX, etc.)
  • phenotyping program (legacy MGP, EUMODIC, etc.)
  • phenotyping resource (EuroPhenome, MGP, IMPC)
  • phenotyping pipeline (EUMODIC1, EUMODIC2, MGP, IMPC adult, IMPC embryonic, etc.)
  • phenotyping procedure or parameter
  • allele name or MGI allele ID
  • strain name or MGI strain ID
  • gene symbol or MGI gene ID
  • any combination of these fields
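As a minimal sketch of the last point, several fields can be combined into one Solr query with the AND operator. The helper name combine_fields is hypothetical, and the value JAX for phenotyping_center is an assumption about how that center is indexed; only the field names come from this page.

```python
from typing import Mapping


def combine_fields(criteria: Mapping[str, str]) -> str:
    """Join field/value pairs into a single Solr query string using AND."""
    clauses = []
    for field, value in criteria.items():
        # Quote each value so spaces and punctuation stay inside one term;
        # backslashes and double quotes must be escaped inside the quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"')
    return " AND ".join(clauses)


# "JAX" is an assumed value for phenotyping_center (illustration only).
query = combine_fields({"marker_symbol": "Akt2", "phenotyping_center": "JAX"})
print(query)  # marker_symbol:"Akt2" AND phenotyping_center:"JAX"
```

The resulting string is what would be passed as the q parameter of a select request.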

The statistical-result REST API provides the fields described in the table below. Each field may be used to restrict the set of statistical results returned.

The full Solr select syntax is available for querying the REST API. See the Solr Wiki pages Solr Query Syntax and Common Query Parameters for a more complete list of query options.
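Among the common Solr query parameters, q, fq (filter query), fl (field list), start and rows are probably the ones needed most often. The following is a sketch of how a select URL might be assembled from them. The function name build_select_url is hypothetical, and the zygosity value homozygote is an assumed example value.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select"


def build_select_url(q, fq=None, fl=None, start=0, rows=10, wt="json"):
    """Assemble a Solr select URL from common query parameters."""
    params = [("q", q)]
    # fq may be repeated; each filter narrows the result set further.
    for filter_query in fq or []:
        params.append(("fq", filter_query))
    # fl is a comma-separated list of the fields to return.
    if fl:
        params.append(("fl", ",".join(fl)))
    # start and rows together page through the results.
    params += [("start", start), ("rows", rows), ("wt", wt)]
    # Keep ':', '*' and ',' readable; everything else unsafe is percent-encoded.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe=":*,")


# "homozygote" is an assumed value for the zygosity field (illustration only).
print(build_select_url(
    "marker_symbol:Akt2",
    fq=["zygosity:homozygote"],
    fl=["parameter_name", "p_value"],
    start=20,
    rows=10,
))
```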

| Field name | Datatype | Description |
|---|---|---|
| doc_id | string | The ID of the Solr document |
| db_id | int | The IMPC internal database identifier |
| data_type | string | The type of the underlying data for which the statistic was calculated |
| mp_term_id | string | The accession ID of the MP term associated with this result |
| mp_term_name | string | The name of the MP term associated with this result |
| top_level_mp_term_id | string | The accession ID of the top level MP term obtained by interrogating the MP ontology |
| top_level_mp_term_name | string | The name of the top level MP term obtained by interrogating the MP ontology |
| intermediate_mp_term_id | string | The accession IDs of all intermediate level MP terms obtained by interrogating the MP ontology |
| intermediate_mp_term_name | string | The names of all intermediate level MP terms obtained by interrogating the MP ontology |
| phenotype_sex | list of strings | If the result produced a phenotype call, this lists all the sexes for which the call is significant |
| resource_name | string | The short name of the resource responsible for producing the data |
| resource_fullname | string | The full name of the resource responsible for producing the data |
| resource_id | int | The IMPC internal identifier of the resource responsible for producing the data |
| project_name | string | The consortium/project that produced the data |
| phenotyping_center | string | The center where the data was generated |
| pipeline_stable_id | string | IMPReSS pipeline identifier |
| pipeline_stable_key | string | IMPReSS pipeline stable key |
| pipeline_name | string | IMPReSS pipeline name |
| pipeline_id | int | IMPC internal ID representing the IMPReSS pipeline |
| procedure_stable_id | string | IMPReSS procedure stable ID |
| procedure_stable_key | string | IMPReSS procedure stable key |
| procedure_name | string | IMPReSS procedure name |
| procedure_id | int | IMPC internal ID representing the IMPReSS procedure |
| parameter_stable_id | string | IMPReSS parameter stable ID |
| parameter_stable_key | string | IMPReSS parameter stable key |
| parameter_name | string | IMPReSS parameter name |
| parameter_id | int | IMPC internal ID representing the IMPReSS parameter |
| colony_id | string | Phenotyping center specific colony name of the line used to generate the data |
| marker_symbol | string | Gene symbol |
| marker_accession_id | string | MGI ID of the gene |
| allele_symbol | string | Allele symbol |
| allele_name | string | Allele name |
| allele_accession_id | string | MGI ID of the allele |
| strain_name | string | Deprecated. Please see genetic_background description |
| strain_accession_id | string | The background strain MGI accession ID (or IMPC ID when MGI accession is not available) |
| genetic_background | string | The background strain name of the specimen |
| sex | string | The sex of the specimen |
| zygosity | string | The zygosity of the mutant specimen |
| control_selection_method | string | The strategy used to select the control set (options are baseline_all or concurrent) |
| dependent_variable | string | The variable being tested |
| metadata | list of strings | Metadata is data that describes the conditions under which the data was collected (e.g. machine calibration date) |
| metadata_group | string | A collection of biological specimens that were all tested under the same experimental conditions. The experimental conditions are identified by the metadata_group tag. For more information, see the IMPReSS parameter documentation section. Required for data analysis. |
| control_biological_model_id | int | IMPC internal ID of the biological model of the control group |
| mutant_biological_model_id | int | IMPC internal ID of the biological model of the experimental group |
| male_control_count | int | Count of male specimens in the control group |
| male_mutant_count | int | Count of male specimens in the experimental group |
| female_control_count | int | Count of female specimens in the control group |
| female_mutant_count | int | Count of female specimens in the experimental group |
| statistical_method | string | The statistical method used to calculate the P value |
| status | string | The status of the statistical calculation |
| additional_information | string | Any additional information about the calculation |
| raw_output | string | The actual R output produced while performing the calculation |
| p_value | float | The P value of the data |
| effect_size | float | The effect size of the data |
| categories | string | Categories of data [4] |
| categorical_p_value | float | The P value for categorical data [3] |
| categorical_effect_size | float | The effect size (max percentage change) for categorical data [3] |
| batch_significant | boolean | Whether the random variable “batch” is significant [1] |
| variance_significant | boolean | Whether the variance is significant [1] |
| null_test_p_value | float | The overall significance result of the calculation [1] |
| genotype_effect_p_value | float | The significance of the genotype effect to describe variation in the data [1] |
| genotype_effect_stderr_estimate | float | The estimate of the standard error of the genotype effect [1] |
| genotype_effect_parameter_estimate | float | The effect size estimate of the genotype effect [1] |
| sex_effect_p_value | float | The significance of sex to describe variation in the data [1] |
| sex_effect_stderr_estimate | float | The estimate of the standard error of the sex effect [1] |
| sex_effect_parameter_estimate | float | The effect size estimate of the sex effect [1] |
| weight_effect_p_value | float | The significance of weight to describe variation in the data [1] |
| weight_effect_stderr_estimate | float | The estimate of the standard error of the weight effect [1] |
| weight_effect_parameter_estimate | float | The effect size estimate of the weight effect [1] |
| group_1_genotype | string | The genotype of the first group (usually +/+) |
| group_1_residuals_normality_test | float | Significance that group 1 conforms to normal distribution |
| group_2_genotype | string | The genotype of the second group (usually the colony ID) |
| group_2_residuals_normality_test | float | Significance that group 2 conforms to normal distribution |
| blups_test | float | Best Linear Unbiased Prediction test |
| rotated_residuals_test | float | Additional statistical model details. See PhenStat documentation for more information |
| intercept_estimate | float | Additional statistical model details. See PhenStat documentation for more information |
| intercept_estimate_stderr_estimate | float | Additional statistical model details. See PhenStat documentation for more information |
| interaction_significant | boolean | Whether the sex*genotype interaction is significant [1] |
| interaction_effect_p_value | float | The significance of sex*genotype interaction to describe variation in the data [1] |
| female_ko_effect_p_value | float | If sex is significant, the significance of the female genotype [1] |
| female_ko_effect_stderr_estimate | float | If sex is significant, the standard error estimate of the female genotype [1] |
| female_ko_parameter_estimate | float | If sex is significant, the effect size estimate of the female genotype [1] |
| male_ko_effect_p_value | float | If sex is significant, the significance of the male genotype [1] |
| male_ko_effect_stderr_estimate | float | If sex is significant, the standard error estimate of the male genotype [1] |
| male_ko_parameter_estimate | float | If sex is significant, the effect size estimate of the male genotype [1] |
| classification_tag | string | A summary of the result [1] |

[1] – For unidimensional parameters
[2] – For multidimensional parameters
[3] – For time series parameters
[4] – For categorical parameters
[5] – For metadata parameters
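The datatype of a field suggests how a restriction on it would be written in Solr query syntax: a quoted term for strings, true/false for booleans, and a [low TO high] range for numeric fields. The following is a rough sketch only. The helper field_clause is hypothetical, and the threshold shown is an arbitrary illustration rather than a recommended cutoff.

```python
def field_clause(field, value):
    """Format one Solr clause according to the Python type of the value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return f"{field}:{str(value).lower()}"
    # A (low, high) tuple becomes an inclusive range; None means open-ended.
    if isinstance(value, tuple):
        low, high = value
        low_text = "*" if low is None else low
        high_text = "*" if high is None else high
        return f"{field}:[{low_text} TO {high_text}]"
    if isinstance(value, (int, float)):
        return f"{field}:{value}"
    # Everything else is treated as a string and quoted.
    return f'{field}:"{value}"'


print(field_clause("marker_symbol", "Akt2"))        # marker_symbol:"Akt2"
print(field_clause("batch_significant", True))      # batch_significant:true
print(field_clause("p_value", (None, 0.0001)))      # p_value:[* TO 0.0001]
```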

 

Command line examples

Retrieve all statistical-result calculations

This is the basic request to get the first 10 results from the Solr service in JSON:

curl \
--basic \
-X GET \
'https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=*:*&rows=10&wt=json&indent=true'

Note that the number of rows has been limited to 10.

A bit of explanation:

  • statistical-result is the name of the Solr core service to query
  • select is the method used to query the Solr REST interface
  • q=*:* means querying everything without any filtering on any field
  • rows limits the number of results returned
  • wt=json returns the results in JSON format (alternatives are “csv” and “xml”)
  • indent=1 or indent=true indents the output into a more human-readable form
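For readers who would rather script the request than use curl, a rough Python equivalent might look like the sketch below. It uses only the standard library. The function names are hypothetical, and the parsing step assumes the usual Solr JSON layout in which the matching documents sit under response.docs.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

SELECT_URL = "https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select"


def select_url(q="*:*", rows=10):
    """Build the same URL as the curl example: core, select, q, rows, wt, indent."""
    params = {"q": q, "rows": rows, "wt": "json", "indent": "true"}
    # Keep ':' and '*' readable so the URL matches the documented form.
    return SELECT_URL + "?" + urlencode(params, quote_via=quote, safe=":*")


def parse_docs(body):
    """Return (numFound, docs) from a Solr JSON response body."""
    response = json.loads(body)["response"]
    return response["numFound"], response["docs"]


def fetch(q="*:*", rows=10):
    """Send the request over the network and return (numFound, docs)."""
    with urlopen(select_url(q, rows), timeout=30) as reply:
        return parse_docs(reply.read().decode("utf-8"))


if __name__ == "__main__":
    # Printing the URL needs no network access; call fetch() to run the query.
    print(select_url())
```

Calling fetch("marker_symbol:Akt2") would then return the total number of matches together with the first 10 documents as Python dictionaries.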

Retrieve all IMPC statistical results for a specific marker

We will constrain the results by adding a condition to the q (query) parameter using the specific marker_symbol field. For Akt2, simply specify q=marker_symbol:Akt2

curl \
--basic \
-X GET \
'https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Akt2&wt=json&indent=true'

More examples of how to use Solr to query and filter are available on the genotype-phenotype API documentation page.

URL examples

The following are example URLs that you can paste into your browser's location bar. The resulting documents will display like a web page.

NOTE: Certain characters, most notably spaces and the “<” and “>” characters, must be URL-encoded (space = %20, < = %3c, > = %3e) for command line usage.
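The encoding does not have to be done by hand. As a small sketch, Python's urllib.parse.quote produces the same escapes; it emits uppercase hex digits (%3C rather than %3c), which is equivalent in a URL. The helper name encode_query_value is hypothetical, and the range query is only an illustration of a value containing spaces and brackets.

```python
from urllib.parse import quote


def encode_query_value(value):
    """Percent-encode a query value, leaving ':' and '*' readable."""
    return quote(value, safe=":*")


print(encode_query_value(" "))                    # %20
print(encode_query_value("<"))                    # %3C
print(encode_query_value(">"))                    # %3E
print(encode_query_value("p_value:[0 TO 0.05]"))  # p_value:%5B0%20TO%200.05%5D
```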

 

Get a maximum of 500 statistical results for gene Car4

https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Car4&rows=500

 

Get a maximum of 500 statistical results for gene Car4 in CSV format

https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Car4&rows=500&wt=csv

 

Get a list of specific fields from statistical results for gene Car4 in CSV format

https://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Car4&wt=csv&fl=parameter_stable_id,parameter_name,female_control_count,female_mutant_count,male_control_count,male_mutant_count,null_test_p_value,statistical_method,allele_name,status,marker_accession_id,marker_symbol,allele_accession_id,allele_symbol,colony_id,metadata_group,zygosity,strain_accession_id,strain_name,classification_tag,genotype_effect_p_value,effect_size,genotype_effect_stderr_estimate
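A CSV response of this kind can be read with Python's csv module. The sketch below assumes the first line of the response is a header row of field names. The sample text is made up purely for illustration and is not real IMPC data, and the threshold is an arbitrary example value.

```python
import csv
import io


def parse_csv_results(text):
    """Turn a wt=csv response body into a list of dicts keyed by field name."""
    return list(csv.DictReader(io.StringIO(text)))


def below_threshold(rows, field="null_test_p_value", threshold=0.0001):
    """Keep rows whose numeric field is present and below the threshold."""
    selected = []
    for row in rows:
        cell = row.get(field, "")
        # Empty cells mean the field was not populated for that result.
        if cell and float(cell) < threshold:
            selected.append(row)
    return selected


# Made-up sample rows (illustration only, not real IMPC data).
SAMPLE = (
    "parameter_name,null_test_p_value,zygosity\n"
    "Example parameter A,0.00002,homozygote\n"
    "Example parameter B,,homozygote\n"
    "Example parameter C,0.3,heterozygote\n"
)

rows = parse_csv_results(SAMPLE)
print(len(rows))                                               # 3
print([r["parameter_name"] for r in below_threshold(rows)])    # ['Example parameter A']
```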
