REST API documentation for IMPC and Legacy statistical result access

The statistical-result REST API provides access to the detailed output generated as part of the statistical analysis process.

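The API may be queried using HTTP GET requests against the Solr select endpoint used in the examples below: http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select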

The statistical-result REST API provides the fields described in the table below. Any field may be used to restrict the set of statistical results returned. The full Solr select syntax is available when querying the REST API. See http://wiki.apache.org/solr/SolrQuerySyntax and http://wiki.apache.org/solr/CommonQueryParameters for a more complete list of query options.

Field name | Datatype | Description
doc_id | string | The ID of the Solr document
db_id | int | The IMPC internal database identifier
data_type | string | The type of the underlying data for which the statistic was calculated
mp_term_id | string | The accession ID of the MP term associated with this result
mp_term_name | string | The name of the MP term associated with this result
top_level_mp_term_id | string | The accession ID of the top level MP term obtained by interrogating the MP ontology
top_level_mp_term_name | string | The name of the top level MP term obtained by interrogating the MP ontology
intermediate_mp_term_id | string | The accession IDs of all intermediate level MP terms obtained by interrogating the MP ontology
intermediate_mp_term_name | string | The names of all intermediate level MP terms obtained by interrogating the MP ontology
phenotype_sex | list of strings | If the result produced a phenotype call, this lists all the sexes for which the call is significant
resource_name | string | The short name of the resource responsible for producing the data
resource_fullname | string | The full name of the resource responsible for producing the data
resource_id | int | The IMPC internal identifier of the resource responsible for producing the data
project_name | string | The consortium/project that produced the data
phenotyping_center | string | The center where the data was generated
pipeline_stable_id | string | IMPReSS pipeline identifier
pipeline_stable_key | string | IMPReSS pipeline stable key
pipeline_name | string | IMPReSS pipeline name
pipeline_id | int | IMPC internal ID representing the IMPReSS pipeline
procedure_stable_id | string | IMPReSS procedure stable ID
procedure_stable_key | string | IMPReSS procedure stable key
procedure_name | string | IMPReSS procedure name
procedure_id | int | IMPC internal ID representing the IMPReSS procedure
parameter_stable_id | string | IMPReSS parameter stable ID
parameter_stable_key | string | IMPReSS parameter stable key
parameter_name | string | IMPReSS parameter name
parameter_id | int | IMPC internal ID representing the IMPReSS parameter
colony_id | string | Phenotyping center specific colony name of the line used to generate the data
marker_symbol | string | Gene symbol
marker_accession_id | string | MGI ID of the gene
allele_symbol | string | Allele symbol
allele_name | string | Allele name
allele_accession_id | string | MGI ID of the allele
strain_name | string | Deprecated. Please see genetic_background description
strain_accession_id | string | The background strain MGI accession ID (or IMPC ID when MGI accession is not available)
genetic_background | string | The background strain name of the specimen
sex | string | The sex of the specimen
zygosity | string | The zygosity of the mutant specimen
control_selection_method | string | The strategy used to select the control set (options are baseline_all or concurrent)
dependent_variable | string | The variable being tested
metadata | list of strings | Metadata is data that describes the conditions under which the data was collected (e.g. machine calibration date)
metadata_group | string | A collection of biological specimens that were all tested under the same experimental conditions. The experimental conditions are identified by the metadata_group tag. For more information, see the IMPReSS parameter documentation section Required For Data Analysis.
control_biological_model_id | int | IMPC internal ID of the biological model of the control group
mutant_biological_model_id | int | IMPC internal ID of the biological model of the experimental group
male_control_count | int | Count of male specimens in the control group
male_mutant_count | int | Count of male specimens in the experimental group
female_control_count | int | Count of female specimens in the control group
female_mutant_count | int | Count of female specimens in the experimental group
statistical_method | string | The statistical method used to calculate the P value
status | string | The status of the statistical calculation
additional_information | string | Any additional information about the calculation
raw_output | string | The actual R output produced while performing the calculation
p_value | float | The P value of the data
effect_size | float | The effect size of the data
categories | string | Categories of data [4]
categorical_p_value | float | The P value for categorical data [4]
categorical_effect_size | float | The effect size (max percentage change) for categorical data [4]
batch_significant | boolean | True if the random variable "batch" is significant, false otherwise [1]
variance_significant | boolean | True if variance is significant, false otherwise [1]
null_test_p_value | float | The overall significance result of the calculation [1]
genotype_effect_p_value | float | The significance of the genotype effect to describe variation in the data [1]
genotype_effect_stderr_estimate | float | The estimate of the standard error of the genotype effect [1]
genotype_effect_parameter_estimate | float | The effect size estimate of the genotype effect [1]
sex_effect_p_value | float | The significance of sex to describe variation in the data [1]
sex_effect_stderr_estimate | float | The estimate of the standard error of the sex effect [1]
sex_effect_parameter_estimate | float | The effect size estimate of the sex effect [1]
weight_effect_p_value | float | The significance of weight to describe variation in the data [1]
weight_effect_stderr_estimate | float | The estimate of the standard error of the weight effect [1]
weight_effect_parameter_estimate | float | The effect size estimate of the weight effect [1]
group_1_genotype | string | The genotype of the first group (usually +/+)
group_1_residuals_normality_test | float | Significance that group 1 conforms to normal distribution
group_2_genotype | string | The genotype of the second group (usually the colony ID)
group_2_residuals_normality_test | float | Significance that group 2 conforms to normal distribution
blups_test | float | Best Linear Unbiased Prediction test
rotated_residuals_test | float | Additional statistical model details. See PhenStat documentation for more information
intercept_estimate | float | Additional statistical model details. See PhenStat documentation for more information
intercept_estimate_stderr_estimate | float | Additional statistical model details. See PhenStat documentation for more information
interaction_significant | boolean | True if the sex*genotype interaction is significant, false otherwise [1]
interaction_effect_p_value | float | The significance of sex*genotype interaction to describe variation in the data [1]
female_ko_effect_p_value | float | If sex is significant, the significance of the female genotype [1]
female_ko_effect_stderr_estimate | float | If sex is significant, the standard error estimate of the female genotype [1]
female_ko_parameter_estimate | float | If sex is significant, the effect size estimate of the female genotype [1]
male_ko_effect_p_value | float | If sex is significant, the significance of the male genotype [1]
male_ko_effect_stderr_estimate | float | If sex is significant, the standard error estimate of the male genotype [1]
male_ko_parameter_estimate | float | If sex is significant, the effect size estimate of the male genotype [1]
classification_tag | string | A summary of the result [1]
[1] - For unidimensional parameters
[2] - For multidimensional parameters
[3] - For time series parameters
[4] - For categorical parameters
[5] - For metadata parameters

Command line examples

Retrieve all statistical-result calculations

This is the basic request to get the first 10 results from the Solr service in JSON format.

        curl \
        --basic \
        -X GET \
        'http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=*:*&rows=10&wt=json&indent=true'
        

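A bit of explanation: q=*:* matches every document in the statistical-result core, rows=10 limits the response to 10 documents, wt=json selects JSON as the response format, and indent=true pretty-prints the response.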
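
The same basic request can also be issued from a script. The following is a rough sketch using only the Python standard library. The function names select_url and fetch are hypothetical, and fetch needs network access, so it is defined here but not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select"


def select_url(q="*:*", rows=10):
    """Build the select URL with the same parameters as the curl example."""
    params = {"q": q, "rows": rows, "wt": "json", "indent": "true"}
    # safe=":*" leaves ':' and '*' unescaped so the URL reads like the curl one.
    return BASE_URL + "?" + urlencode(params, safe=":*")


def fetch(url, timeout=30):
    """Send a GET request and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = select_url()
print(url)
# To run the request: data = fetch(url)
```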

Retrieve all IMPC statistical results for a specific marker

We constrain the results by adding a condition on the marker_symbol field to the q (query) parameter. For Akt2, simply specify q=marker_symbol:Akt2.

        curl \
        --basic \
        -X GET \
        'http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Akt2&wt=json&indent=true'
        

More examples of how to use Solr to query and filter are available on the genotype-phenotype API documentation page.
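
Once a JSON response such as the one for the Akt2 query has been downloaded, the matching documents can be read from it. The sketch below assumes the standard Solr JSON layout, where the total hit count is under response.numFound and the documents are under response.docs. The sample payload is a made-up placeholder, not real IMPC output, and the function name summarize is hypothetical.

```python
import json

# Made-up placeholder payload in the standard Solr JSON layout.
# The values are illustrative only and are NOT real IMPC results.
SAMPLE = """
{
  "responseHeader": {"status": 0},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"marker_symbol": "Akt2", "parameter_name": "placeholder parameter A", "p_value": 0.5},
      {"marker_symbol": "Akt2", "parameter_name": "placeholder parameter B"}
    ]
  }
}
"""


def summarize(payload):
    """Return (total hits, [(parameter_name, p_value), ...]) from a Solr JSON response."""
    response = payload["response"]
    # A document may lack a field entirely, so use .get() instead of indexing.
    rows = [(doc.get("parameter_name"), doc.get("p_value")) for doc in response["docs"]]
    return response["numFound"], rows


total, rows = summarize(json.loads(SAMPLE))
print(total)
for name, p_value in rows:
    print(name, p_value)
```

Note that numFound is the total number of matches, which can be larger than the number of documents returned when the rows parameter limits the response.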

URL examples

The following are example URLs that you can paste into your browser's location bar. The resulting documents will display like a web page.

NOTE: Certain characters, most notably spaces and the "<" and ">" characters, must be URL-encoded (space = %20, < = %3c, > = %3e) for command line usage.
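
The encoding described in the note can be produced programmatically instead of by hand. This is a small sketch using Python's urllib.parse.quote. The query string is only an illustration built from fields in the table above. Python emits uppercase hex digits (%3C rather than %3c), and the two forms are equivalent.

```python
from urllib.parse import quote

# Show the encoded form of each character mentioned in the note.
for ch in (" ", "<", ">"):
    print(repr(ch), "->", quote(ch))

# Encode a whole query value; safe=":" leaves the field:value separator readable.
query = "marker_symbol:Car4 AND control_selection_method:baseline_all"
encoded = quote(query, safe=":")
print(encoded)
```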

Get a maximum of 500 statistical results for gene Car4

http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Car4&rows=500

Get a maximum of 500 statistical results for gene Car4 in CSV format

http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Car4&rows=500&wt=csv

Get a list of specific fields from statistical results for gene Car4 in CSV format

http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select?q=marker_symbol:Car4&wt=csv&fl=parameter_stable_id,parameter_name,female_control_count,female_mutant_count,male_control_count,male_mutant_count,null_test_p_value,statistical_method,allele_name,status,marker_accession_id,marker_symbol,allele_accession_id,allele_symbol,colony_id,metadata_group,zygosity,strain_accession_id,strain_name,classification_tag,genotype_effect_p_value,effect_size,genotype_effect_stderr_estimate
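
A long fl list like the one above is easier to maintain when the URL is assembled from a list of field names. The following is a minimal sketch, assuming the CSV output starts with a header row naming the requested fields. The helper names csv_url and parse_csv are hypothetical, and the sample CSV text is a made-up placeholder, not real IMPC output.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint taken from the URL examples above.
BASE_URL = "http://www.ebi.ac.uk/mi/impc/solr/statistical-result/select"

# A short selection of fields from the table in this document.
FIELDS = ["marker_symbol", "parameter_stable_id", "p_value", "effect_size"]


def csv_url(marker_symbol, fields, rows=500):
    """Build a CSV request restricted to the given fields via the fl parameter."""
    params = {
        "q": "marker_symbol:" + marker_symbol,
        "rows": rows,
        "wt": "csv",
        "fl": ",".join(fields),
    }
    # safe=":," keeps the field separator and the fl commas unescaped.
    return BASE_URL + "?" + urlencode(params, safe=":,")


def parse_csv(text):
    """Turn CSV text with a header row into a list of dicts keyed by field name."""
    return list(csv.DictReader(io.StringIO(text)))


url = csv_url("Car4", FIELDS)
print(url)

# Made-up placeholder CSV, NOT real IMPC output.
sample = "marker_symbol,parameter_stable_id,p_value,effect_size\nCar4,PLACEHOLDER_001,0.5,0.1\n"
records = parse_csv(sample)
print(records[0]["parameter_stable_id"], records[0]["p_value"])
```

All values come back from the CSV reader as strings, so numeric fields such as p_value need to be converted with float() before use.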