I know that, in theory, I had a statistics class... 10 years ago. From it I remember, oh, approximately nothing. Now I'm in a situation where knowing a little bit of statistics might actually help.
This may or may not help, since I'm using knowledge from a totally different type of experiment.
For luciferase reporter assays, the stats I do are t-tests. A t-test compares two things at a time and spits out a p-value, so I had to do a separate one for each mutant compared to the control. Excel can do it (there's a help file), but I used a program called Prism, which is actually pretty useful. I started out using it just for the much prettier graphs, but it has all sorts of built-in stats.
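For anyone without Prism or Excel handy, here's a minimal sketch of the same idea in Python with SciPy -- the numbers and variable names are invented, purely to show the call:

```python
from scipy import stats

# Hypothetical luciferase readings (arbitrary units); real data would go here.
control = [1.0, 1.2, 0.9, 1.1, 1.0]
mutant_a = [1.8, 2.1, 1.9, 2.3, 2.0]

# Two-sample t-test, run separately for each mutant against the control.
t_stat, p_value = stats.ttest_ind(control, mutant_a)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

You'd repeat the call once per mutant; passing `equal_var=False` gives Welch's variant if the groups have different spreads.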
Another thing you could try would be a Pearson correlation analysis, though that doesn't take the SD into account, just the average. Also, Excel only gives you the Pearson r (ideal correlation is -1 or +1, depending on the direction) but no p-value. Again, I used Prism, so YMMV.
Ooh! Prism! I have Prism... I've used Prism. Not for stats, but for making plots and stuff before... I kind of forgot about it, but it's still on my computer :-) I'll have to check out what it can do.
ANOVA is probably what you want here -- it looks like you want single-factor ANOVA (the number of factors is the number of experimental conditions you're varying -- for example, if you simultaneously added different combinations of inhibitor and inducer, you'd have a two-factor model). This may be helpful. Unfortunately, I don't know how to do it in Excel... I'd plug it into R, but that seems less than helpful given your preferences on software :-/
(Keep in mind that I'm no expert on this, so if one of the many better stats people on your friend list comes along and smacks me, don't say I didn't warn you ;-)
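To make the single-factor case concrete, here's what it might look like in Python with SciPy (the condition names and data are made up):

```python
from scipy import stats

# Hypothetical measurements under three experimental conditions (invented data).
control = [1.0, 1.2, 0.9, 1.1]
treated_1 = [1.9, 2.1, 1.8, 2.2]
treated_2 = [1.1, 0.8, 1.0, 1.2]

# One-way (single-factor) ANOVA: a single p-value answering
# "does the condition matter at all?"
f_stat, p_value = stats.f_oneway(control, treated_1, treated_2)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```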
If I understand correctly, the intended purpose of an ANOVA test is to compare the variances in two different sets of samples.
That is not my impression... if one only has two sets of samples, you may as well do a t-test. The purpose of ANOVA (in my understanding) is to provide a framework for analyzing the effects of one or more factors, each of which is present in more than two levels -- ANOVA is explicitly designed to be used when you have more than two sets of samples. One can then figure out which factors are significant and what role they play. You're correct that the initial output of ANOVA is just a p-value for whether or not a given factor matters, but the information you get from ANOVA can then be used in post tests that let you compare the mean effects of different levels of a factor (it sounds like that is what is wanted here), or to calculate confidence intervals for the mean of each level.
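As a sketch of what such a post test can look like -- using simple Bonferroni-corrected pairwise t-tests here, rather than the Tukey-style tests a package like Prism would offer, and with invented data and group names:

```python
from itertools import combinations

from scipy import stats

# Hypothetical data for three levels of one factor (made-up numbers).
groups = {
    "control": [1.0, 1.2, 0.9, 1.1],
    "treated_1": [1.9, 2.1, 1.8, 2.2],
    "treated_2": [1.1, 0.8, 1.0, 1.2],
}

# After a significant overall ANOVA, compare each pair of levels.
# The Bonferroni correction multiplies each raw p-value by the number
# of comparisons, to control for doing several tests at once.
pairs = list(combinations(groups, 2))
for name_a, name_b in pairs:
    _, p = stats.ttest_ind(groups[name_a], groups[name_b])
    p_adj = min(p * len(pairs), 1.0)
    print(f"{name_a} vs {name_b}: adjusted p = {p_adj:.4f}")
```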
Eh, no, that is what I meant, I just didn't say it very clearly.
t-test: compares means between 2 sets.
ANOVA: compares *variance* between 2 sets. This can be used to compare *means* between lots of sets (compares variance between means vs variance within sets). Would be overkill for just comparing means between 2 sets.
and, yes, if you want an overall answer for how much the level of blah matters (as opposed to a bunch of separate answers on whether condition A matters, whether condition B matters, etc.) then ANOVA is the way to go.
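The "variance between means vs variance within sets" idea can be written out by hand. This sketch (made-up numbers) computes the F ratio from those two quantities and checks it against SciPy's one-way ANOVA:

```python
from scipy import stats

# Hypothetical data: three sets of four measurements each (invented numbers).
sets = [
    [1.0, 1.2, 0.9, 1.1],
    [1.9, 2.1, 1.8, 2.2],
    [1.1, 0.8, 1.0, 1.2],
]

k = len(sets)                      # number of sets
n = sum(len(s) for s in sets)      # total number of observations
means = [sum(s) / len(s) for s in sets]
grand = sum(sum(s) for s in sets) / n

# "Variance between means": spread of the set means around the grand mean.
ss_between = sum(len(s) * (m - grand) ** 2 for s, m in zip(sets, means))
# "Variance within sets": spread of each point around its own set mean.
ss_within = sum((x - m) ** 2 for s, m in zip(sets, means) for x in s)

# F is the ratio of the two, each divided by its degrees of freedom.
f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F = {f_ratio:.2f}")

# Sanity check: this matches SciPy's one-way ANOVA statistic.
f_scipy, _ = stats.f_oneway(*sets)
assert abs(f_ratio - f_scipy) < 1e-9
```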
Sorry for being behind everyone else, and I'll rummage through my stats texts looking for more information, but for right now I just want to say:
DO NOT USE A STUDENT'S T-TEST FOR THIS ANALYSIS.
t-tests assume that the underlying distribution of your data is normal, which 1) is obviously not the case for the data you're showing, and 2) is never the case for counts reasonably close to zero.
You will get a wrong answer if you try to use a t-test.
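If you'd rather test the normality assumption than eyeball it, a Shapiro-Wilk test is one common option. A sketch, with invented low-count data of the kind a t-test would mishandle:

```python
from scipy import stats

# Hypothetical low counts (made-up numbers), strongly skewed toward zero.
counts = [0, 0, 1, 0, 0, 0, 2, 0, 0, 8]

# Shapiro-Wilk tests the null hypothesis that the sample came from a
# normal distribution; a small p-value means normality is rejected and
# a t-test on these data would be suspect.
w_stat, p_value = stats.shapiro(counts)
print(f"W = {w_stat:.3f}, p = {p_value:.4f}")
```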
Alright, been doing some more reading. More caveats:
ALSO DO NOT USE AN ANOVA. ANOVAs have the same assumptions of normality that t-tests do, and you will get a wrong answer.
From what I can tell, you should either 1) use a non-parametric test for your analysis (probably a Mann-Whitney rank test) if you don't know anything about the underlying distribution, or 2) if the underlying distributions are Poisson (which, to my eye, they look like they may be), use a Poisson comparison test like this one.
A Poisson comparison would be more powerful, but I don't know enough about your data to know if it's a Poisson process.
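The specific test linked above isn't reproduced here, but one standard way to compare two Poisson counts -- assuming equal observation effort in both conditions, which is an assumption of this sketch -- is the conditional binomial test: given the total, one count is Binomial(total, 0.5) under the null of equal rates. With made-up counts:

```python
from scipy import stats

# Hypothetical total event counts from two conditions observed with
# equal effort (invented numbers).
count_a, count_b = 36, 18

# Under the null hypothesis of equal Poisson rates, count_a given the
# total follows Binomial(total, 0.5), so an exact binomial test applies.
total = count_a + count_b
result = stats.binomtest(count_a, total, 0.5)
print(f"p = {result.pvalue:.4f}")
```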
Give me a bit and I'll write up some quick code for running a Mann-Whitney rank test.
Actually, no need for the code - that is one of the tests Prism has in its list, and it's actually what Prism selected for me when I checked the "not normally distributed" box.
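For readers without Prism, the Mann-Whitney rank test is also a one-liner in SciPy. A sketch with invented count data:

```python
from scipy import stats

# Hypothetical counts per well for two conditions (made-up numbers).
control = [0, 1, 0, 2, 1, 0, 1, 0]
treated = [3, 5, 2, 4, 6, 3, 4, 5]

# Mann-Whitney U: non-parametric, compares ranks rather than raw values,
# so it makes no normality assumption about the underlying distribution.
u_stat, p_value = stats.mannwhitneyu(control, treated, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```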
Good luck. Stats are a pain.
Glad it's working for you!