.. _label-Examples-PowerAnalysis: Power analysis ===================================== .. _label-Examples-PowerAnalysis-OneSample: One-sample ___________________ Here the Canadian temperature dataset from :ref:`the previous section ` will be used to demonstrate one-sample power analysis. A full script for one-sample power analysis appears below, and immediately below that we shall consider it in more detail. .. plot:: :include-source: import power1d J = 8 # sample size Q = 365 # continuum size # construct baseline geometry: g0 = power1d.geom.GaussianPulse( Q , q=200 , fwhm=190 , amp=40 ) g1 = power1d.geom.Constant( Q , amp=23 ) baseline = g0 - g1 # subtract the geometries # construct signal geometry: signal0 = power1d.geom.Null( Q ) signal1 = power1d.geom.GaussianPulse( Q , q=200 , fwhm=100 , amp=5 ) # construct noise model: noise0 = power1d.noise.Gaussian( J , Q , mu = 0 , sigma = 0.3 ) noise1 = power1d.noise.SmoothGaussian( J , Q , mu = 0 , sigma = 3 , fwhm = 70 ) noise = power1d.noise.Additive( noise0 , noise1 ) # create data sample models: model0 = power1d.models.DataSample( baseline, signal0, noise, J=J ) model1 = power1d.models.DataSample( baseline, signal1, noise, J=J ) # create experiment models: teststat = power1d.stats.t_1sample emodel0 = power1d.models.Experiment( model0 , teststat ) # null hypothesis emodel1 = power1d.models.Experiment( model1 , teststat ) # alternative hypothesis # simulate the experiments: sim = power1d.ExperimentSimulator( emodel0 , emodel1 ) results = sim.simulate( 5000, progress_bar=True ) # visualize the power results: results.plot() All of the commands from the code above up until "# create experiment models" are describe in :ref:`the previous section regarding data modeling `. The only difference is that here two signals and two data sample models are created: one representing the null hypothesis and one representing the alternative hypothesis. In **power1d** the "Null" geometry is a null continuum (i.e. a continuum with all zero values) as seen :ref:`here `. Once the signals and data samples are constructed, the next step is to add them to a **power1d.models.Experiment** object along with a test statistic function. When there is just one data sample model the test statistic function should accept a (J x Q) data array and return a (Q,) array representing the test statistic continuum. Test statistic functions for a variety of common experimental designs are available in :ref:`label-API-stats`. Next the two **Experiment** models, representing the null and alternative hypotheses are added to an **ExperimentSimulator** object. When the *simulate* method is called for *N* iterations (here *N$=5000) the simulator will simualte both the null and the alternative experiments for *N* iterations and store the test statistic continua for both as (N x Q) arrays in the **results.Z0** and **results.Z1** attributes, respectively. Last, the simulation results are summarized graphically using the *plot* method. The top left panel of the results depicts the null and alternative models and lists their omnibus powers. By definition the 'power' of the null model is alpha, and alpha is 0.05 by default. Alpha and all other power-relevant parameters can be changed using the the *results.set_* methods as summarized in :ref:`the power1d.results API `. Here the alternative model has an omnibus power of 0.843. This sugges that, if our noise models are correct, we will reject the null hypothesis (of null signal) with a probability of 0.843. However, this doesn't imply that we will detect the modeled signal with a probability of 0.843 because occassionally we will detect 'signal' which randomly appears in continuum regions outside of the signal region. This point is emphasized in the point-of-interest (POI) and center-of-interest (COI) power continua in the lower right panel. The POI continuum indicates the probability of rejecting the null hypothesis at particular continuum points, and the COI continuum indicates the same but extends the search region to a certain radius (here 3 points) around the POI. The POI and COI continuum results suggest that we are only about 70% likely to detect the signal we have modeled, even though the omnibus power is greater than 80%. If we were to broaden the COI radius to the full continuum size of $Q$=365 using *results.set_coi_radius(365)*, the COI continuum would plateau at the omnbus power of 0.843. Thus the POI and COI continua quantify the regional specificity of the omnibus power results. Two-sample ___________________ A slightly simpler model than the one above will be used to demonstrate power calculations for two-sample and other experiments. In the example below signal is modeled as a GaussianPulse at point 65 in a continuum of length 101 and noise is modeled as uncorreletated Gaussian. The two-sample test statistic function (**power1d.stats.t_2sample_fn**) is used to calculate the test statistic continuum Note that **power1d.stats** functions with "_fn" appended to the end are slightly more efficient calculators than are those without "_fn" appended. This is because the "_fn" functions store calculations results like matrix inversion results so that they don't have to be repeated across every single simulation iteration. More details regarding this are provided in :ref:`label-API-stats`. Also note that, unlike the one-sample results above, **power1d** separates the null and alternative models into separate panels. This is done to prevent single-panel clutter. .. plot:: :include-source: import numpy as np import power1d #(0) Create geometry and noise models: JA,JB,Q = 5, 7, 101 baseline = power1d.geom.Null(Q=Q) signal0 = power1d.geom.Null(Q=Q) signal1 = power1d.geom.GaussianPulse(Q=101, q=65, amp=2.5, sigma=10) noise = power1d.noise.Gaussian(J=5, Q=101, sigma=1) #(1) Create data sample models: modelA0 = power1d.models.DataSample(baseline, signal0, noise, J=JA) #null A modelB0 = power1d.models.DataSample(baseline, signal0, noise, J=JB) #null N modelA1 = power1d.models.DataSample(baseline, signal0, noise, J=JA) #alternative A modelB1 = power1d.models.DataSample(baseline, signal1, noise, J=JB) #alternative B #(2) Create experiment models: teststat = power1d.stats.t_2sample_fn(JA, JB) expmodel0 = power1d.models.Experiment([modelA0, modelB0], teststat) #null expmodel1 = power1d.models.Experiment([modelA1, modelB1], teststat) #alternative #(3) Simulate experiments: np.random.seed(0) sim = power1d.ExperimentSimulator(expmodel0, expmodel1) results = sim.simulate(1000, progress_bar=True) results.plot() Regression ___________________ Regression can be implemented using the "regressor" keyword argument when constructing a **DataSample** model as indicated below. Note that regression models are plotted with a different color scheme from those above: dark and bright colors correspond to small and large regressor values, respectively. In the "Alternative" model below (top right panel) it is apparent that dependent variable values get larger in the vicinity of Q=65 as the regressor values get larger, implying positive correlation between the independent and dependent variables in this vicinity. .. plot:: :include-source: import numpy as np import power1d #(0) Create geometry and noise models: J,Q = 30, 101 x = np.linspace(0, 2, J) #regressor (must have J values) baseline = power1d.geom.Null(Q=Q) signal0 = power1d.geom.Null(Q=Q) signal1 = power1d.geom.GaussianPulse(Q=101, q=65, amp=2.0, sigma=10) noise = power1d.noise.Gaussian(J=5, Q=101, sigma=1) #(1) Create data sample models: model0 = power1d.models.DataSample(baseline, signal0, noise, J=J, regressor=x) model1 = power1d.models.DataSample(baseline, signal1, noise, J=J, regressor=x) #(2) Create experiment models: teststat = power1d.stats.t_regress_fn(x) expmodel0 = power1d.models.Experiment(model0, teststat) expmodel1 = power1d.models.Experiment(model1, teststat) #(3) Simulate experiments: np.random.seed(0) sim = power1d.ExperimentSimulator(expmodel0, expmodel1) results = sim.simulate(100, progress_bar=True) results.plot() One-way ANOVA ___________________ Power analyses for one-way ANOVA can be conducted almost identically to the two-sample analysis described above. The only difference is that more **DataSample** models are needed to represent each of the groups. .. plot:: :include-source: import numpy as np import power1d #(0) Create geometry and noise models: JA,JB,JC,Q = 5, 7, 12, 101 baseline = power1d.geom.Null(Q=Q) signal0 = power1d.geom.Null(Q=Q) signal1 = power1d.geom.GaussianPulse(Q=101, q=65, amp=1.5, sigma=10) noise = power1d.noise.Gaussian(J=5, Q=101, sigma=1) #(1) Create data sample models: modelA0 = power1d.models.DataSample(baseline, signal0, noise, J=JA) #null A modelB0 = power1d.models.DataSample(baseline, signal0, noise, J=JB) #null B modelC0 = power1d.models.DataSample(baseline, signal0, noise, J=JC) #null C modelA1 = power1d.models.DataSample(baseline, signal0, noise, J=JA) #alternative A modelB1 = power1d.models.DataSample(baseline, signal0, noise, J=JB) #alternative B modelC1 = power1d.models.DataSample(baseline, signal1, noise, J=JC) #alternative C #(2) Create experiment models: teststat = power1d.stats.f_anova1_fn(JA, JB, JC) expmodel0 = power1d.models.Experiment([modelA0, modelB0, modelC0], teststat) expmodel1 = power1d.models.Experiment([modelA1, modelB1, modelC1], teststat) #(3) Simulate experiments: np.random.seed(0) sim = power1d.ExperimentSimulator(expmodel0, expmodel1) results = sim.simulate(500, progress_bar=True) results.plot()