Gene
The following script requests an Allen gene ID, builds the query text, and queries an Allen database. Gene data may also be downloaded via other identifiers, e.g., acronym ('ACCN1') or Entrez ID (40), instead of an Allen gene ID (37 for 'ACCN1'); download and amend the script according to your preference.
geneQuery.m
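As a rough illustration of the "build the query text" step, the sketch below assembles a query string for a gene identified by its Allen gene ID. It is not taken from geneQuery.m: the base URL, the RMA criteria syntax, and the alternative field names `acronym` and `entrez_id` are assumptions that should be checked against the Allen Brain Atlas API documentation before use.

```matlab
% Hypothetical sketch: build the query text for one gene.
% ASSUMPTIONS: the base URL and the RMA criteria syntax below are not from geneQuery.m.
baseURL = 'http://api.brain-map.org/api/v2/data/query.json';

field = 'id';   % assumed alternatives: 'acronym', 'entrez_id'
value = 37;     % Allen gene ID of ACCN1

if ischar(value)
    % Text values (e.g., an acronym) are wrapped in single quotes.
    criterion = ['[', field, '$eq''', value, ''']'];
else
    % Numeric values (e.g., an Allen gene ID or an Entrez ID) are written as-is.
    criterion = ['[', field, '$eq', num2str(value), ']'];
end

queryText = [baseURL, '?criteria=model::Gene,rma::criteria,', criterion];
disp(queryText)
```

Setting `field = 'acronym'` and `value = 'ACCN1'` exercises the quoted branch instead.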
Humans
The ABA data sets of a gene are associated with more than one person. Here, the details of all the people associated with the ABA's gene data sets are downloaded.
Expression Levels
The expression levels of a gene, per structure, may be studied on more than one probe. Here, expression levels are extracted from all the probes associated with a gene.
expressionLevels.m
Anatomical Structures
These are the anatomical structures in which expression measurements were made.
anatomicalStructures.m
Descriptive Statistics
Because the ABI usually investigates genes on more than one probe, we have to make a decision about how to use the resulting expression data. There are a number of options, e.g., use the
- probe that has the highest expression levels, on average.
- arithmetic or geometric mean of the [corresponding] expression levels of all the probes.
- arithmetic or geometric mean of the [corresponding] expression levels of the two probes that correlate best.
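The options listed above can be sketched on a small made-up matrix. The values are illustrative only, not ABA data. To stay within base MATLAB, the sketch uses `corrcoef` (Pearson) and computes the geometric mean as `exp(mean(log(x)))`; it does not use `corr` or `geomean`.

```matlab
% Illustrative data only: rows are samples, columns are probes.
explevels = [2 4 8; 1 3 9; 4 4 4; 3 6 12];

% Option 1: the probe with the highest mean expression level.
[~, iBest] = max(mean(explevels, 1));
bestProbe = explevels(:, iBest);

% Option 2: arithmetic and geometric means across all probes, per sample.
% exp(mean(log(x))) is the geometric mean; it requires positive values.
arithmeticMean = mean(explevels, 2);
geometricMean  = exp(mean(log(explevels), 2));

% Option 3: arithmetic mean of the two probes that correlate best.
R = corrcoef(explevels);              % Pearson correlations between columns
R(logical(eye(size(R)))) = -Inf;      % exclude each probe's correlation with itself
[~, k] = max(R(:));
[p, q] = ind2sub(size(R), k);
pairMean = mean(explevels(:, [p q]), 2);
```

Which option is appropriate depends on how far the probes agree; option 3 discards the probes that disagree most with the rest.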
1. Descriptive Statistics of Expression Levels per Probe of a Gene. There are three probes associated with gene ID 37 (Entrez ID: 40, acronym: ACCN1). The boxplots summarise the descriptive statistics of the expression data.
2. Line Graphs of Probes per Gene. This is the gene expression data summarised in the boxplots above.
Per Top Structure
If interested in a particular top structure:
    % List the top structures and ask the user to pick one
    disp([num2cell((1:size(uniqueTopStructureID,1))') uniqueTopStructureNames])
    disp(' ')
    j = input('Input a unique structure ID from the list above (from the left column): ');

    if ismember(j, (1:size(uniqueTopStructureID,1))')
        for r = 1:1:size(humanData.msg, 2)
            % Data of donor r
            Indices = (donorID == humanData.msg{1,r}.donor_id);
            Parts = partsID(Indices,:);
            Expressions = explevels(Indices,:);
            N = (1:sum(Indices))';

            % Rows that belong to the selected top structure
            Indices = find(Parts(:,1) == uniqueTopStructureID(j));
            if isempty(Indices)
                continue;
            end

            % Sort by structure ID (column 2); the last column is a running index
            Set = [sortrows([Parts(Indices,:) Expressions(Indices,:)], 2) (1:size(Indices,1))'];
            iParts = dsearchn(uniqueStructureID, Set(:,2));

            figure
            hold on
            for n = 1:1:size(Expressions, 2)
                % Set(:,end) replaces the hard-coded column 7, which was valid for three probes only
                plot(Set(:,end), Set(:,3 + n), 'x', 'Color', rand(1,3))
            end
            box on

            % One tick per distinct structure, labelled with its abbreviation
            [~, I] = unique(Set(:,2), 'first');
            set(gca, 'XTick', Set(I,end), 'XTickLabel', structureAbbreviations(iParts(I)))
            % xticklabel_rotate(Set(I,end), 90, structureAbbreviations(iParts(I)))

            % Capitalise the first letter of each word of the structure name
            String = uniqueTopStructureNames{j};
            iUpper = strfind(String, ' ');
            String([1 iUpper + 1]) = upper(String([1 iUpper + 1]));

            xlabel([String, ' Structures'])
            ylabel('Expression Level')
            title({String, ['Gene ID: ', num2str(geneID), ', Donor: ' humanData.msg{1,r}.name]})
            legendText = strcat(repmat({'Probe ID'}, numel(probesSet), 1), repmat({': '}, numel(probesSet), 1), cellstr(num2str(probesSet)));
            legend(legendText, 'location', 'northeastoutside')
            hold off
        end
    else
        errordlg('Invalid Structure ID')
    end
Comparative Analysis
The log-log graph of the two probe data sets, per donor, that correlate best (and the correlation values).
    if size(probesSet, 1) > 1
        compareProbes = cell(1, size(humanData.msg, 2));
        for r = 1:1:size(humanData.msg, 2)
            % Data of donor r
            Indices = (donorID == humanData.msg{1,r}.donor_id);
            Parts = partsID(Indices,:);
            Expressions = explevels(Indices,:);
            N = (1:sum(Indices))';

            % Correlations
            Correlations = corr(Expressions, 'type', 'Spearman');
            disp(['Allen Gene ID: ' num2str(genesParameters.msg{1,1}.id), ', Donor: ' humanData.msg{1,r}.name, ' -->'])
            disp([[{'Probe ID'}; num2cell(probesSet)] num2cell([probesSet'; Correlations])])

            % The Most Correlative Pair
            % The diagonal is set to -Inf so that a probe is never paired with
            % itself, even when every off-diagonal correlation is negative.
            Correlations(logical(eye(size(Correlations)))) = -Inf;
            [Maximums, cIndices] = max(Correlations, [], 2);
            [~, rIndices] = max(Maximums);
            iCompare = [rIndices cIndices(rIndices)];

            % Check
            R = Correlations(rIndices, cIndices(rIndices));

            % Comparative Analysis
            figure
            maloglog(Expressions(:, iCompare(1)), Expressions(:, iCompare(2)));
            xlabel(['Probe ID: ', num2str(probesSet(iCompare(1)))])
            ylabel(['Probe ID: ', num2str(probesSet(iCompare(2)))])
            title({genesParameters.msg{1,1}.name, ['(Allen Gene ID: ' num2str(genesParameters.msg{1,1}.id), ', Donor: ' humanData.msg{1,r}.name, ', Correlation: ', num2str(R), ')']})

            compareProbes{1,r} = iCompare;
        end
    end
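The Spearman correlation used above is the Pearson correlation of the ranked data. The sketch below illustrates this on made-up values using base MATLAB only. It assumes there are no ties; tied values would need average ranks, which this sketch does not handle.

```matlab
% Illustrative values only: two probes measured at the same five samples.
x = [1.2; 3.4; 2.2; 5.1; 4.0];
y = [2.0; 2.9; 3.1; 6.3; 4.4];

% Rank each series (valid only when there are no ties).
[~, order] = sort(x);
rx = zeros(size(x));
rx(order) = 1:numel(x);
[~, order] = sort(y);
ry = zeros(size(y));
ry(order) = 1:numel(y);

% Spearman's rho is the Pearson correlation of the ranks.
C = corrcoef(rx, ry);
rho = C(1,2);

% Equivalent closed form when there are no ties.
d = rx - ry;
n = numel(x);
rhoClosed = 1 - 6*sum(d.^2) / (n*(n^2 - 1));
```

Because only the ranks matter, the result is unchanged by any monotonic rescaling of the expression levels, such as a log transform.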
Geometric Means
Observe the effect of using the geometric means of the expression levels (per point) of the two probes that correlate best.
    if size(probesSet, 1) > 1
        for r = 1:1:size(humanData.msg, 2)
            % Data of donor r, restricted to the two probes that correlate best
            Indices = (donorID == humanData.msg{1,r}.donor_id);
            N = (1:sum(Indices))';
            Parts = partsID(Indices,:);
            Expressions = explevels(Indices,:);
            Expressions = Expressions(:, compareProbes{1,r});
            Expressions(Expressions < 0) = 0;

            uniqueParts = unique(Parts, 'rows');
            uniqueValues = zeros(size(uniqueParts, 1), 1);
            uniqueCorrelations = zeros(size(uniqueParts, 1), 1);
            valuesSeries = zeros(sum(Indices), 1);
            correlationsSeries = zeros(sum(Indices), 1);

            for n = 1:1:size(uniqueParts, 1)
                % Points of distinct structure n, excluding points with a zero level
                Index = ismember(Parts, uniqueParts(n,:), 'rows');
                iN = N(Index);
                Set = Expressions(Index, :);
                iSet = ~any(Set == 0, 2);

                if sum(iSet) == 1
                    % A single point: no correlation can be computed, so a
                    % spread-based score is used in its place
                    uniqueValues(n,1) = geomean(Set(iSet,:), 2);
                    uniqueCorrelations(n,1) = 1 - erf(std(Set(iSet,:)));
                    valuesSeries(iN(iSet)) = uniqueValues(n,1);
                    correlationsSeries(iN(iSet)) = uniqueCorrelations(n,1);
                    continue;
                elseif sum(iSet) == 0
                    continue;
                end

                uniqueValues(n,1) = geomean(geomean(Set(iSet,:), 2));
                Correlation = corr(Set(iSet,:), 'type', 'Spearman');
                uniqueCorrelations(n,1) = Correlation(1,2);
                valuesSeries(iN(iSet)) = uniqueValues(n,1);
                correlationsSeries(iN(iSet)) = uniqueCorrelations(n,1);
            end

            correlationColour = [191 191 0]/255;
            pairIDs = probesSet(compareProbes{1,r});
            probeLabels = strcat(repmat({'Probe ID'}, numel(pairIDs), 1), repmat({': '}, numel(pairIDs), 1), cellstr(num2str(pairIDs(:))));
            titleText = {genesParameters.msg{1,1}.name, ['(Allen Gene ID: ' num2str(genesParameters.msg{1,1}.id), ', Donor: ' humanData.msg{1,r}.name, ')']};

            % Figure 1: geometric mean per smallest distinct structure
            figure
            plot(N, Expressions(:, 1), 'b+', N, Expressions(:, 2), 'k+', N, valuesSeries, 'g+')
            hold on
            plot(N, correlationsSeries, '.', 'Color', correlationColour)
            xlabel('Data Points of Probes')
            ylabel('Expression Levels')
            title(titleText)
            legendText = [probeLabels; {'Geometric Mean per Smallest Distinct Structure'}; {'Correlation r per Distinct Structure Values'}];
            legend(legendText)

            % Figure 2: geometric mean of corresponding points of the two probes
            figure
            plot(N, Expressions(:, 1), 'b+', N, Expressions(:, 2), 'k+', N, geomean(Expressions, 2), 'g+')
            hold on
            plot(N, correlationsSeries, '.', 'Color', correlationColour)
            xlabel('Data Points of Probes')
            ylabel('Expression Levels')
            title(titleText)
            legendText = [probeLabels; {'Geometric Mean of Corresponding Points per Probe'}; {'Correlation r per Distinct Structure Values'}];
            legend(legendText)
        end
    end
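The handling of non-positive values in the code above can be illustrated on a few made-up points: negative expression levels are set to zero, and points where either probe is zero are left out of the geometric mean. This is a minimal sketch with illustrative values. It uses `sqrt(a.*b)` for the two-probe geometric mean instead of `geomean`, so it needs no toolbox.

```matlab
% Illustrative values only: rows are data points, columns are the two probes.
pairLevels = [4 9; -1 5; 2 8; 0 3];

pairLevels(pairLevels < 0) = 0;        % negative levels are treated as zero
valid = ~any(pairLevels == 0, 2);      % points where both probes are positive

% The geometric mean of two values is the square root of their product.
pointMeans = zeros(size(pairLevels, 1), 1);
pointMeans(valid) = sqrt(pairLevels(valid, 1) .* pairLevels(valid, 2));
```

Excluded points keep a value of zero here, matching how `valuesSeries` is initialised with zeros in the script.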