SNPExpress

Introduction

SNPExpress database

A project has been launched to investigate tissue-specific genetic regulation of splicing and expression in two human tissues. The project includes 93 brain samples and 80 PBMC (peripheral blood mononuclear cell) samples. Both sample types were genotyped using the Illumina HumanHap 550K BeadChip, and expression was measured at the transcript and exon levels using the Affymetrix Human ST 1.0 chip. Together, these data let us ask which SNPs influence the expression and splicing of human genes (Figure 1).

Figure 1

SNPExpress is a database, with an accompanying user interface, that stores these genotype and expression data and supports convenient searches for genes and SNPs in the context of these data. In particular, an analysis focusing on 'cis-acting genetic regulation' has been conducted on these data, and its association results have been fully incorporated into the SNPExpress database. In this project, 'cis-acting genetic regulation' was defined as an association between the expression level of a transcript or exon and a SNP lying in the 100kb region surrounding it (Figure 2). Each association was tested using a linear regression model corrected for age, sex, tissue source, and curated EIGENSTRAT axes.

Figure 2
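The cis definition and the covariate-corrected regression described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the function names are hypothetical, the window is assumed to extend 100kb on each side of the transcript or exon, genotypes are assumed to be coded additively as 0, 1, or 2 copies of an allele, and the covariates (age, sex, tissue source, EIGENSTRAT axes) are passed as plain numeric columns. The P value would come from a test on the genotype coefficient, which is not shown here.

```python
CIS_WINDOW_BP = 100_000  # assumed: 100kb flank on each side of the feature


def in_cis_window(snp_pos, feature_start, feature_end, window=CIS_WINDOW_BP):
    """Return True if the SNP lies within `window` bp of the transcript or exon."""
    return feature_start - window <= snp_pos <= feature_end + window


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_genotype_effect(expression, genotype, covariates):
    """Least-squares fit of expression ~ 1 + genotype + covariates.

    `covariates` holds one list of numeric values per subject.
    Returns [intercept, genotype_slope, covariate_slopes...].
    """
    rows = [[1.0, float(g)] + [float(c) for c in cov]
            for g, cov in zip(genotype, covariates)]
    p = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(p)]
    return solve_linear(xtx, xty)
```

In this sketch, each SNP that passes `in_cis_window` for a given transcript or exon would be tested once per tissue, and the second returned coefficient is the per-allele effect of the SNP on expression after adjusting for the covariates.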

SNPExpress can also use the WGAViewer program to display GWAS results in real time. Figure 3 illustrates the workflow.

Figure 3

 

Summary of SNPExpress data

Both tissues: 295,696 genes.
Both tissues: 1,411,399 exon probesets.
Both tissues: 571,738 SNPs.
Brain set: 93 subjects.
PBMC set: 80 subjects.
Brain set: 24,775,665 transcript expression data records.
Brain set: 131,260,107 exon expression data records.
PBMC set: 22,111,615 transcript expression data records.
PBMC set: 112,911,920 exon expression data records.
Brain set: 11,055,181 P values for transcript association.
Brain set: 88,657,129 P values for exon association.
PBMC set: 10,650,397 P values for transcript association.
PBMC set: 85,438,942 P values for exon association.

In total:
585,771,630 data points.
2.77 GB in binary format.
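The total can be reproduced from the figures listed above. The breakdown below is inferred from the arithmetic rather than stated on this page: it assumes the total counts the expression records, the association P values, and one genotype call per SNP per subject. The variable names are illustrative only.

```python
# Expression records and association P values, as listed in the summary.
records = {
    "brain transcript expression": 24_775_665,
    "brain exon expression": 131_260_107,
    "PBMC transcript expression": 22_111_615,
    "PBMC exon expression": 112_911_920,
    "brain transcript P values": 11_055_181,
    "brain exon P values": 88_657_129,
    "PBMC transcript P values": 10_650_397,
    "PBMC exon P values": 85_438_942,
}

n_snps = 571_738
n_subjects = 93 + 80  # brain + PBMC

# Assumption: one genotype call is stored per SNP per subject.
genotype_calls = n_snps * n_subjects

total = sum(records.values()) + genotype_calls
print(total)  # 585771630, matching the stated total
```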

 

Contact | ©2008 Dongliang Ge, PhD