Presentiment Paper Discussion

...is essentially the same as Chris's suggestion that bias is eliminated when instead of averaging over actual number of calm/emotional trials, we average over expected number of calm/emotional trials - this is what your term t/2 represents.

I say that you got there from different starting points because, as best I can tell, Chris got there by starting from thinking about how biased averages could be corrected, and, as best I can tell, you started from thinking about how unbiased sums could be capitalised upon.

No, I think what was in my mind was more a useful way of rescaling the unbiased sums. Strictly I don't think the sum divided by the expected number of trials should be called an average.

Anyhow, the good news is that Linda now seems to be happy with Jay's statement about the sums being unbiased. So we don't need to have any more discussions about sums of no elements. :D:D:D:D:D

The bad news is that she now seems to be suggesting that the bias is actually due to the averages being undefined for sequences of all calm or all emotional trials. :( (Though what she says isn't very clear and anyway it may have changed by tomorrow. ;)) That's not what causes the bias. It's caused by the fact that the response tends to be lower overall for sequences with more emotional trials. Any sensitivity to assumptions about all-calm or all-emotional sequences will be absolutely tiny in real experiments, because such sequences would be incredibly rare.
 
No, I think what was in my mind was more a useful way of rescaling the unbiased sums. Strictly I don't think the sum divided by the expected number of trials should be called an average.

Oh, dude. So you got to it in the same way that Jay did. You are a mere mortal after all. :-P (I had been so impressed with this insight because I had assumed that you had gotten to it through the starting point of averages rather than sums - I think you will agree that it is much more impressive to arrive at it from the starting point I assumed than from the one you actually used. Oh well! You still have plenty of smart credits in my eyes!)

Anyhow, the good news is that Linda now seems to be happy with Jay's statement about the sums being unbiased. So we don't need to have any more discussions about sums of no elements. :D:D:D:D:D

The bad news is that she now seems to be suggesting that the bias is actually due to the averages being undefined for sequences of all calm or all emotional trials. :( (Though what she says isn't very clear and anyway it may have changed by tomorrow. ;))

All I can say is that I look forward to Linda's response/clarification.

That's not what causes the bias. It's caused by the fact that the response tends to be lower overall for sequences with more emotional trials.

You've written this a few times now and I've never understood what you meant, and have always suspected that it stems from a misunderstanding of your own, but have never engaged you on it until now, so here goes.

Which response do you mean? Response over calm trials or over emotional trials? Doesn't, under these assumptions, the difference between these (calm/emotional responses) cancel out, regardless of whether there are fewer or more emotional trials? And isn't the true cause of the bias as identified by Dalkvist et al: that there is a fundamental difference in the averages (of calm/emotional trials) resulting from the two similar scenarios where a string of N calm trials ends in a calm trial versus when that string of N calm trials ends instead in an emotional trial? And yes, it's more complicated than this, but isn't this the essence of it?

Any sensitivity to assumptions about all-calm or all-emotional sequences will be absolutely tiny in real experiments, because such sequences would be incredibly rare.

Agreed.
 
You've written this a few times now and I've never understood what you meant, and have always suspected that it stems from a misunderstanding of your own, but have never engaged you on it until now, so here goes.

Which response do you mean? Response over calm trials or over emotional trials? Doesn't, under these assumptions, the difference between these (calm/emotional responses) cancel out, regardless of whether there are fewer or more emotional trials? And isn't the true cause of the bias as identified by Dalkvist et al: that there is a fundamental difference in the averages (of calm/emotional trials) resulting from the two similar scenarios where a string of N calm trials ends in a calm trial versus when that string of N calm trials ends instead in an emotional trial? And yes, it's more complicated than this, but isn't this the essence of it?

Yes - in sequences where there are more emotional trials the response tends to be lower in both the calm and the emotional trials. But the effect doesn't cancel out, because the sequences where there are more emotional trials are necessarily sequences where there are fewer calm trials. So in an average, the effect is actually reinforced.

I've just been trying to think of an easy way of getting this across intuitively, but it would involve extreme examples, and it might confuse some people even more. But I think if you look at the tables you've produced, they will illustrate how the average emotional response per sequence correlates negatively with the number of emotional trials in the sequence, and the average calm response per sequence correlates positively with the number of calm trials in the sequence. And that those correlations will work through to produce an expected value of the average emotional response that's higher than the expected value of the average calm response.
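The correlations described above can be checked by brute force over all four-trial sequences (a sketch in Python, assuming the carryover model used in the tables later in the thread: arousal starts at 0, rises by 1 after each calm trial, and resets to 0 after an emotional one; `pearson` is just a hand-rolled helper):

```python
from itertools import product

def responses(seq):
    # Arousal before each trial: starts at 0, +1 after each calm (C) trial,
    # reset to 0 after an emotional (E) trial.
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

def pearson(xs, ys):
    # Plain Pearson correlation coefficient, no libraries needed.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

e_counts, e_avgs, c_counts, c_avgs = [], [], [], []
for seq in product('CE', repeat=4):
    r = responses(seq)
    cs = [x for x, s in zip(r, seq) if s == 'C']
    es = [x for x, s in zip(r, seq) if s == 'E']
    if es:
        e_counts.append(len(es)); e_avgs.append(sum(es) / len(es))
    if cs:
        c_counts.append(len(cs)); c_avgs.append(sum(cs) / len(cs))

print(pearson(e_counts, e_avgs))  # negative: more E trials -> lower average E response
print(pearson(c_counts, c_avgs))  # positive: more C trials -> higher average C response
```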
 
So, Chris, this is all very interesting. :-)

I tested your assertion that average emotional response correlates negatively with number of emotional trials, and that average calm response correlates positively with number of calm trials, and, as best I can tell, the test came up false (edit: oh, actually, I'm wrong, because these are correlations in the same direction whereas I had originally assumed that they were in opposite directions, and in fact the correlations are borne out after all, however I don't believe they have explanatory power in and of themselves). But there was a fascinating pattern and set of observations that I think makes up for this failure (edit: OK, not a failure after all, but since the correlations go in the same direction, they don't seem to explain the bias in and of themselves).

Here's how I tested it: I picked a semi-arbitrary number of trials, four, corresponding to the maximum that I thought it would be feasible to iterate over and work with. Then I tabulated the per-sequence averages (edit: of arousal levels, of course) for both calm and emotional trials over each of the full set of sequences (assuming a reset arousal level of zero rather than one). Here's the result:

Code:
Seq    Avg(C) Avg(E)
--------------------
CCCC   3/2    n/a
CCCE   1      3
CCEC   1/3    2
CCEE   1/2    1
CECC   1/3    1
CECE   0      1
CEEC   0      1/2
CEEE   0      1/3
ECCC   1      0
ECCE   1/2    1
ECEC   0      1/2
ECEE   0      1/3
EECC   1/2    0
EECE   0      1/3
EEEC   0      0
EEEE   n/a    0
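For reference, the table above can be regenerated with a short script (a sketch, assuming the stated model: reset arousal level of zero, incremented by one after each calm trial and reset after each emotional trial; `None` stands for n/a):

```python
from itertools import product
from fractions import Fraction

def responses(seq):
    # arousal starts at 0, +1 after each calm (C) trial, resets after an emotional (E) trial
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

def avg(vals):
    return Fraction(sum(vals), len(vals)) if vals else None  # None = "n/a"

table = {}
for seq in product('CE', repeat=4):
    r = responses(seq)
    table[''.join(seq)] = (
        avg([x for x, s in zip(r, seq) if s == 'C']),
        avg([x for x, s in zip(r, seq) if s == 'E']),
    )

print(table['CCCC'])  # (Fraction(3, 2), None)
print(table['CCEC'])  # (Fraction(1, 3), Fraction(2, 1))
```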

Next, I considered all sequences containing exactly 4, 3, 2, 1 and 0 calm trials (and thus, respectively, 0, 1, 2, 3 and 4 emotional trials), and listed the averages associated with each of those from the previous table. Here's the result:

Code:
#Cs  Avgs(C)                   Avgs(E)
--------------------------------------
 4   3/2                       n/a
 3   1, 1/3, 1/3, 1            3, 2, 1, 0
 2   1/2, 0, 0, 1/2, 0, 1/2    1, 1, 1/2, 1, 1/2, 0
 1   0, 0, 0, 0                1/3, 1/3, 1/3, 0
 0   n/a                       0

Finally, I calculated the averages of each of these averages, as well as the number of sequences associated with each average, and then, just for extra measure, I calculated the sum of the averages:

Code:
#Cs  Avg(Avgs(C))  Avg(Avgs(E))  #sequences  Sum(Avgs(C))  Sum(Avgs(E))
 4      3/2           n/a            1           3/2             n/a
 3      2/3           3/2            4           8/3             6
 2      1/4           2/3            6           3/2             4
 1      0             1/4            4           0               1
 0      n/a           0              1           n/a             0
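The grouping and averaging steps can be sketched in a few lines (same assumed arousal model as before; `sum([])` prints 0 where the table has n/a):

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

def responses(seq):
    # arousal starts at 0, +1 after each calm trial, resets after an emotional one
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

# group per-sequence averages by the number of calm trials (#Cs)
groups = defaultdict(lambda: {'C': [], 'E': [], 'n': 0})
for seq in product('CE', repeat=4):
    r = responses(seq)
    g = groups[seq.count('C')]
    g['n'] += 1
    cs = [x for x, s in zip(r, seq) if s == 'C']
    es = [x for x, s in zip(r, seq) if s == 'E']
    if cs: g['C'].append(Fraction(sum(cs), len(cs)))
    if es: g['E'].append(Fraction(sum(es), len(es)))

for k in range(4, -1, -1):
    g = groups[k]
    avg_c = sum(g['C']) / len(g['C']) if g['C'] else None
    avg_e = sum(g['E']) / len(g['E']) if g['E'] else None
    print(k, avg_c, avg_e, g['n'], sum(g['C']), sum(g['E']))
```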

Isn't that fascinating? Do you notice the diagonal pattern between averages? As I said, it doesn't seem that your correlations hold (edit: see edit above; your correlations hold after all, however I don't think they explain anything in and of themselves), but the fact that the following three observations hold seems to result in the bias that we see:

  • There is a diagonal pattern where the next Avg(Avgs(E)) is equal to the current Avg(Avgs(C)), wrapping.
  • Aside from the wrapping of the diagonal pattern (which I don't think changes this observation anyway), averages of averages of both calm and emotional trials are positively correlated with the number of calm trials.
  • The pattern of number of sequences versus #Cs corresponds (as we would expect, since it represents binary combinations) with a row from Pascal's triangle. Given that (a) only for the maximum number of Cs (4) is Avg(Avgs(C)) > Avg(Avgs(E)) (if we can even say that, since in that case Avg(Avgs(E)) is undefined), that (b) the number of sequences is minimal at this point, and that (c) for all other numbers of Cs, Avg(Avgs(E)) > Avg(Avgs(C)) (or Avg(Avgs(C)) is undefined), it follows that the average of the Avg(Avgs(E))s weighted by #sequences (i.e. equating to the Sum(Avgs(E))s) is greater than the average of the Avg(Avgs(C))s weighted by #sequences (i.e. equating to the Sum(Avgs(C))s). In other words, the overall average for emotional trials is greater than that for calm trials... and hence we get our bias.

Now, we just need to relate this somehow to Dalkvist et al's insight! I'm pretty sure that these results stem from their insight, it just doesn't seem straightforward to explain the results in terms of it.
 
Something else worth noting is that, from what I've read of it so far, this is along the lines that Wackermann's derivation goes.
 
Laird

That's interesting, but I think we're probably just looking at the same thing from two different perspectives.

Considering that we know the sums are unbiased, I think the bias in the averages must be the result of some kind of correlation between the responses and the numbers of trials. If the expected mean response for given numbers of calm and emotional trials didn't vary with those numbers, then the overall expected mean responses for calm and emotional trials would have the same value, and the averages would be unbiased.
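Both halves of this claim can be confirmed by exact enumeration over four-trial sequences (a sketch; "D*" here denotes the rescaled-sum estimator discussed earlier, the sum difference divided by the expected number of trials of each type, t/2, and the true effect µ is 0 since responses never depend on the upcoming stimulus):

```python
from itertools import product
from fractions import Fraction

def responses(seq):
    # arousal starts at 0, +1 after each calm trial, resets after an emotional one
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

n = 4
dstar_total = Fraction(0)
diff_of_avgs = []
for seq in product('CE', repeat=n):
    r = responses(seq)
    cs = [x for x, s in zip(r, seq) if s == 'C']
    es = [x for x, s in zip(r, seq) if s == 'E']
    # D*: sums rescaled by the EXPECTED number of each trial type, t/2
    dstar_total += Fraction(sum(es) - sum(cs)) / Fraction(n, 2)
    if cs and es:  # difference of per-sequence averages, defined only here
        diff_of_avgs.append(Fraction(sum(es), len(es)) - Fraction(sum(cs), len(cs)))

print(dstar_total / 2**n)                      # 0 -> the rescaled sums are unbiased
print(sum(diff_of_avgs) / len(diff_of_avgs))   # 41/84 -> averages show a positive bias
```

So with no effect present, the rescaled sums average to exactly zero over the whole sample space, while the averaged difference-of-averages comes out strictly positive.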
 
The bad news is that she now seems to be suggesting that the bias is actually due to the averages being undefined for sequences of all calm or all emotional trials.

Well, that's essentially what I think too. Let µ be the true average per-person difference between the response for an activating and a calm trial. Let S be a finite sample space of the type of experiment and carryover we've been talking about. Then, using the notation of my most recent post, as an estimator of µ, D̅ = A̅ – C̅ is undefined on S because of the two sequences, Xa and Xc, in S that are comprised exclusively of activating or calm stimuli, respectively. Therefore, we have to do one of two things. The first thing we can do is decide a priori that if any Xa's or Xc's appear in our data set, we'll throw them away. That amounts to changing the sample space to S' = S \ {Xa, Xc}. Then we can compute D̅ on our sample from S'; however E[D̅(S')] ≠ µ.

The other thing we can do is to agree a priori to use two constants, c1 and c2, to represent the otherwise uncomputable C̅a and A̅c of any Xa's and Xc's, respectively, that appear in our data set. Then we can compute D̅ on the original sample space S, but then E[D̅(S)] ≠ µ unless we make miraculous choices for c1 and c2.

To summarize, any practical thing we can do to create a well-defined estimator D̅, results in E[D̅] ≠ µ, and the only reason we have to do anything at all is because of the two sequences comprising either all activating or all calm stimuli. So the cause of the bias is the existence of these two sequences in the sample space that force us to create a biased estimator.
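Both remedies can be checked exactly on four-trial sequences (a sketch, under the carryover model used in the tables above, with the true µ = 0; `c1`/`c2` play the roles of the constants described):

```python
from itertools import product
from fractions import Fraction

def responses(seq):
    # arousal starts at 0, +1 after each calm trial, resets after an emotional one
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

def dbar(seq, c1=None, c2=None):
    # D-bar = A-bar - C-bar; c1 stands in for the undefined C-bar of the all-E
    # sequence, c2 for the undefined A-bar of the all-C sequence
    r = responses(seq)
    cs = [x for x, s in zip(r, seq) if s == 'C']
    es = [x for x, s in zip(r, seq) if s == 'E']
    a_bar = Fraction(sum(es), len(es)) if es else c2
    c_bar = Fraction(sum(cs), len(cs)) if cs else c1
    if a_bar is None or c_bar is None:
        return None
    return a_bar - c_bar

seqs = list(product('CE', repeat=4))
all_d = [dbar(s) for s in seqs]

# Remedy 1: drop the all-C and all-E sequences (sample space S')
defined = [d for d in all_d if d is not None]
print(sum(defined) / len(defined))   # 41/84, not the true value 0

# Remedy 2: plug in constants, e.g. c1 = c2 = 0
filled = [dbar(s, c1=Fraction(0), c2=Fraction(0)) for s in seqs]
print(sum(filled) / len(filled))     # 1/3, still not 0
```

Either way the expectation misses µ = 0, as described; only a contrived choice of c1 and c2 would cancel it.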
 
I think it might be interesting to see an analysis of D*.

What about D* would you like to see analyzed?

What I was drawing attention to was that, despite the unbiased definition, the carryover will show up in almost every realization of the experiment. And that the carryover isn't produced by the bias that arises from the undefined sequences.

I think what you're getting at is along the lines of something I've stated twice now, but which no one has yet reacted to, so I'm not sure if its importance has registered for anybody. So, once again:

First, let S be the sample space of all possible sequences in a trial that employs s-length sequences. Let Q be the set of sequences that are realized in a particular experiment. Since carryover differs from sequence to sequence, then even though D* is an unbiased estimator of µ, if Q is randomly generated then it is unlikely that D*|Q ("D* given Q") will be unbiased. This is why Q should not be randomly generated; it should be fixed, and unless a subset of Q known to be unbiased under the assumed carryover model can be found, then Q should equal S.

Second, all the discussion so far has tacitly assumed that the carryover effect is identical from subject to subject; however, that assumption is implausible. Let's assume that the form of the carryover effect is incremented by c after a calm trial and reset to baseline after an activating trial. Then it is almost sure that the value of c will differ from subject to subject. But carryover also differs from sequence to sequence. So if a subject whose value of c is high, say, is randomized to a sequence with a lot of consecutive Cs and hence a large carryover effect, this will bias D*. That is to say that if P is the set of participants in the experiment, then D*|P may be biased. The way to overcome this is to randomize many subjects to each sequence in the experiment.

The above two considerations taken together imply that, to be reasonably confident that D*|S,P ("D* given S and P") is approximately unbiased, the sequence length in the experiment should be short, so that the size of S is small, so that many subjects may be randomized to each of the fixed set of sequences S. But my suspicion is that parapsychologists are doing the exact opposite: generating long, random sequences, with the number of participants much less than the number of sequences in S.
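Both points can be illustrated with a short enumeration (a sketch; `dstar` is the rescaled-sum estimator, `c` a hypothetical per-subject carryover increment, and the random-Q figure depends on the seed but will typically be nonzero):

```python
from itertools import product
from fractions import Fraction
import random

def responses(seq, c=1):
    # per-subject carryover: arousal rises by c after each calm trial, resets after emotional
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + c if s == 'C' else 0
    return out

def dstar(seq, c=1):
    n = len(seq)
    r = responses(seq, c)
    se = sum(x for x, s in zip(r, seq) if s == 'E')
    sc = sum(x for x, s in zip(r, seq) if s == 'C')
    return Fraction(se - sc) / Fraction(n, 2)

n = 4
all_seqs = list(product('CE', repeat=n))

# Fixed design Q = S, equal weight per sequence: exactly unbiased, for ANY subject's c
for c in (1, 3):
    assert sum(dstar(s, c) for s in all_seqs) == 0

# Randomly generated Q: a handful of random sequences is almost never unbiased
rng = random.Random(1)
q = [rng.choice(all_seqs) for _ in range(5)]
print(sum(dstar(s) for s in q) / len(q))  # typically nonzero, i.e. D*|Q is biased
```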
 
Edit: this is, in the end, a pretty irrelevant post. The one I posted after this (#292) is much more to the point.

Jay, I think you've missed a possible approach to finding the average per-subject difference; an approach that effectively eliminates the problem of undefined values (by "cancelling them out". Edit: this is wrong; based on my findings in post #292 below, the values can be extrapolated and are not the same, thus they cannot "cancel out"), and which demonstrates that the bias remains even when this has been done. I outlined it in my above post to Linda: rather than averaging per-subject differences, take the difference of per-subject averages. This way, we exclude only averages which are undefined, rather than entire differences which are undefined, the disadvantage of the latter being that we "throw away" the information from the other defined average in the difference.

This, as I commented in that earlier post, is suggested by the approach that Dalkvist et al take in some of their tables. Here's an example over sequences with only two trials (assuming reset arousal level of 0):

Code:
Seq  Avg(C)  Avg(E)  Diff
-------------------------
CC    1/2     n/a    n/a
CE    0       1      1
EC    0       0      0
EE    n/a     0      n/a

Avg   1/6     1/3

Using the approach you suggest, we would either fill in "0" for the n/a's (undefined values) in the first and final rows under "Diff", or we would simply exclude those rows when averaging - but both of these approaches "throw away" the information from the 1/2 in the first row under Avg(C) and from the 0 in the final row under Avg(E), and thus we might reasonably expect, as you say, that a bias would result. In any case, the results of each of these approaches (respectively) are 1/4 and 1/2.

The approach you seem to have ignored is, in contrast, to diff the overall averages, the result of which is 1/3 - 1/6 = 1/6. This approach does not "throw away" any information at all; it simply excludes the two undefined averages, which the final table in my last post suggests are for all intents and purposes identical/interchangeable (according to the pattern. Edit: this is wrong; they are not interchangeable; the pattern seems to be a shift, not a wrap-around). The bias, however, remains even in this case, where we are not throwing away any information, and where, as my previous sentence suggests, we "cancel out" the effects of the undefined averages (edit: again, this was wrong; there can be no cancelling because the values are different). This suggests to me that the bias is the result of more than just undefined values. Dalkvist et al have given us a hint as to what that "more" is with their commentary on CC versus CE in a table similar to that above, and there are other intriguing hints in the final table that I drew up in my previous post - perhaps these are simply two different but equally valid ways of analysing the cause of the bias.
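All three figures (1/4, 1/2 and 1/6) can be reproduced from the two-trial table (a sketch, assuming the same reset-0 arousal model; `None` stands for n/a):

```python
from itertools import product
from fractions import Fraction

def responses(seq):
    # arousal starts at 0, +1 after each calm trial, resets after an emotional one
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

per_seq = []
for seq in product('CE', repeat=2):  # CC, CE, EC, EE
    r = responses(seq)
    cs = [x for x, s in zip(r, seq) if s == 'C']
    es = [x for x, s in zip(r, seq) if s == 'E']
    per_seq.append((
        Fraction(sum(cs), len(cs)) if cs else None,
        Fraction(sum(es), len(es)) if es else None,
    ))

# 1) fill in 0 for the undefined differences
fill = [(e - c) if c is not None and e is not None else Fraction(0) for c, e in per_seq]
print(sum(fill) / len(fill))       # 1/4

# 2) exclude the undefined differences entirely
excl = [e - c for c, e in per_seq if c is not None and e is not None]
print(sum(excl) / len(excl))       # 1/2

# 3) difference of overall averages (drop only the undefined averages)
cavgs = [c for c, _ in per_seq if c is not None]
eavgs = [e for _, e in per_seq if e is not None]
print(sum(eavgs) / len(eavgs) - sum(cavgs) / len(cavgs))  # 1/3 - 1/6 = 1/6
```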

Your point that the discussion so far has assumed that expectation effects will be the same between subjects is well taken, and your recommendations in that respect seem appropriate, with the caveat raised by Dalkvist et al (2002): that under some theoretical models of presentiment (time reversal), an effect will not be possible unless the upcoming trial is selected randomly in real time, which cannot be achieved using this approach, because the entire sequence is fixed ahead of time. I'd also add that the discussion so far has (mostly) been operating under another, questionable, assumption: that expectation effects occur to any significant extent anyway. Analysis so far (whether or not it is sufficiently powered) has failed to detect any.
 
Well, that's essentially what I think too. Let µ be the true average per-person difference between the response for an activating and a calm trial. Let S be a finite sample space of the type of experiment and carryover we've been talking about. Then, using the notation of my most recent post, as an estimator of µ, D̅ = A̅ – C̅ is undefined on S because of the two sequences, Xa and Xc, in S that are comprised exclusively of activating or calm stimuli, respectively. Therefore, we have to do one of two things. The first thing we can do is decide a priori that if any Xa's or Xc's appear in our data set, we'll throw them away. That amounts to changing the sample space to S' = S \ {Xa, Xc}. Then we can compute D̅ on our sample from S'; however E[D̅(S')] ≠ µ.

The other thing we can do is to agree a priori to use two constants, c1 and c2, to represent the otherwise uncomputable C̅a and A̅c of any Xa's and Xc's, respectively, that appear in our data set. Then we can compute D̅ on the original sample space S, but then E[D̅(S)] ≠ µ unless we make miraculous choices for c1 and c2.

To summarize, any practical thing we can do to create a well-defined estimator D̅, results in E[D̅] ≠ µ, and the only reason we have to do anything at all is because of the two sequences comprising either all activating or all calm stimuli. So the cause of the bias is the existence of these two sequences in the sample space that force us to create a biased estimator.

The correlations I've described above - which relate to all the possible sequences - will clearly lead to bias, so whether the treatment of these two exceptional sequences introduces a bit of extra bias seems a rather academic point.

Correct me if I'm wrong, but any effect owing to the treatment of these two sequences will be exponentially small for large N, whereas Wackermann found that the bias scaled like 1/N.
 
The correlations I've described above - which relate to all the possible sequences - will clearly lead to bias, so whether the treatment of these two exceptional sequences introduces a bit of extra bias seems a rather academic point.

In fact, my investigations strongly suggest that, rather than the exclusion of these sequences adding a bit of extra bias, by excluding them (or setting their difference to zero) we actually remove bias.

I wrote a little script to generate the same tables of my above post but for an arbitrary number of trials per sequence. The maximum I was able to run it at before I ran out of memory on my little laptop here was 17 trials per sequence, but I tested several smaller values to confirm that these patterns hold across different values. All of this revealed that there are even more patterns in the data. The following table corresponds to the final table in my post a few up, only for 17 trials per sequence rather than 4:

Code:
#Cs  Avg(Avgs(C))  Avg(Avgs(E))  #sequences  Sum(Avgs(C))  Sum(Avgs(E))
 17     8             n/a              1            8            n/a
 16     5             8               17           85            136
 15     3.5           5              136          476            680
 14     2.6           3.5            680         1768           2380
 13     2             2.6           2380         4760           6188
 12     1.571429      2             6188         9724          12376
 11     1.25          1.571429     12376        15470          19448
 10     1             1.25         19448        19448          24310
  9     0.8           1            24310        19448          24310
  8     0.636364      0.8          24310        15470          19448
  7     0.5           0.636364     19448         9724          12376
  6     0.384615      0.5          12376         4760           6188
  5     0.285714      0.384615      6188         1768           2380
  4     0.2           0.285714      2380          476            680
  3     0.125         0.2            680           85            136
  2     0.058824      0.125          136            8             17
  1     0             0.058824        17            0              1
  0     n/a           0                1          n/a              0
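The script might look something like this (a sketch, assuming the same model; note that n = 17 still enumerates 2^17 sequences, so it takes a few seconds and a fair amount of memory):

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

def responses(seq):
    # arousal: reset level 0, +1 after each calm trial, reset to 0 after an emotional one
    level, out = 0, []
    for s in seq:
        out.append(level)
        level = level + 1 if s == 'C' else 0
    return out

def summary(n):
    """Per-#Cs row: (Avg(Avgs(C)), Avg(Avgs(E)), #sequences, Sum(Avgs(C)), Sum(Avgs(E)))."""
    acc = defaultdict(lambda: [[], [], 0])
    for seq in product('CE', repeat=n):
        r = responses(seq)
        k = seq.count('C')
        acc[k][2] += 1
        cs = [x for x, s in zip(r, seq) if s == 'C']
        es = [x for x, s in zip(r, seq) if s == 'E']
        if cs: acc[k][0].append(Fraction(sum(cs), len(cs)))
        if es: acc[k][1].append(Fraction(sum(es), len(es)))
    out = {}
    for k, (cavgs, eavgs, count) in acc.items():
        out[k] = (
            sum(cavgs) / len(cavgs) if cavgs else None,  # None = "n/a"
            sum(eavgs) / len(eavgs) if eavgs else None,
            count,
            sum(cavgs) if cavgs else None,
            sum(eavgs) if eavgs else None,
        )
    return out

rows = summary(17)
print(rows[16])  # (Fraction(5, 1), Fraction(8, 1), 17, Fraction(85, 1), Fraction(136, 1))
```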

Now we can see that not only is there a diagonal pattern across the averages of averages, and a Pascal's triangle-row pattern in #sequences, but there are also symmetric patterns in the sums of averages, centred around #C=9,10. Thus, it is easy to extrapolate the undefined value for the first row of Sum(Avgs(E)); it is 17. From this, we can calculate the undefined value for the first row of Avg(Avgs(E)) by dividing 17 by the value of #sequences, i.e. 1. Thus, we have extrapolated the first missing average: it is 17, the same as the number of trials. Hooray. (Checking other values for the number of trials per sequence, it seems that this rule holds in general: the missing average is equal to the number of trials per sequence.)

I am not an expert at fitting curves, but I also tried using LibreOffice to fit the curve of #Cs versus Avg(Avgs(E)), and from there to extrapolate the value of Avg(Avgs(E)) at #Cs = 17. Whilst I didn't have perfect success, or even settle on which type of curve fits best - tentatively, the best choice seems to be polynomial, followed by exponential - I did come up with values "close enough" to 17, which suggests that this is at least a plausible value.

The interesting thing about this is that, as I wrote above, including the extrapolated value does not mitigate the bias; it only increases it. So, clearly, based on this extrapolation at least, the exclusion of undefined values is not the cause of the bias.

The remaining undefined values, in the final row, are admittedly harder to extrapolate. Should the final row of Sum(Avgs(C)) go negative, should it remain at zero, or should it return to 8? There's no definitive answer from the data, but based on the diagonal pattern across the two averages-of-averages columns, it seems that the most likely extrapolation is a slightly negative number, or at least zero.

Thus, the exclusion of this undefined value has not been a cause of the bias either - when we add in the most likely extrapolated value, we either contribute to (for the extrapolation of a negative value) or do not contribute to (for the extrapolation of zero) the bias; but either way, the difference associated with this sequence seems not to be biased in the opposite direction as all of the others.
 
Just in case it's not implicitly clear from the above post: I think I was wrong in my earlier post to talk about the diagonal sequence "wrapping"; I no longer believe that it wraps; instead it is the same pattern shifted down one, where the missing averages are not, as I suggested in an earlier post, "identical/interchangeable"; they are definitely different.
 
Just returning to something with which I've already agreed:

Any sensitivity to assumptions about all-calm or all-emotional sequences will be absolutely tiny in real experiments, because such sequences would be incredibly rare.

This is demonstrated in the table two posts back, where the contribution to the sum of the differences between averages by these two sequences is (17 - 8) + (0 - (zero or some very small negative number)), which is approximately equal to 9, whereas the sum of the differences contributed by all other sequences is unimaginably massive in comparison; for example, in the case of sequences with 9 calm trials alone, the sum of the differences is (24310 - 19448) x 24310, which equals 118,195,220.

Approximately 9 versus 118,195,220 (and the rest!). Jay and Linda, can you see that the bias is due to more than just leaving out the sequences of all-calm and all-emotional trials i.e. the approximately 9?
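The two figures are trivial to check (taking the 0-or-slightly-negative extrapolation as 0, per the table above):

```python
# contribution of the two degenerate (all-C / all-E) sequences, using the
# extrapolated values 17 and ~0 from the 17-trial table
degenerate = (17 - 8) + (0 - 0)          # approximately 9

# sum of differences for the #Cs = 9 row alone: (Sum(Avgs(E)) - Sum(Avgs(C))) x #sequences
nine_calm_row = (24310 - 19448) * 24310

print(degenerate, nine_calm_row)  # 9 118195220
```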
 
There are so many errors in Linda's most recent post that I thought that, because this is a confusing enough subject to understand without misinformation, it was important to go through and point them all out.

Thank you for making your intentions explicit. If you don't mind, trying to go through your post and straighten it out under those conditions seems a bit futile.

Linda
 
What about D* would you like to see analyzed?

Sorry - I meant an analysis (as in, the analyses/simulations some of these authors undertook) using D*.

I think what you're getting at is along the lines of something I've stated twice now, but which no one has yet reacted to, so I'm not sure if its importance has registered for anybody.

Yeah, after I wrote it I thought, "wait a minute, isn't that similar to what Jay said earlier...?" :)

So, once again:

First, let S be the sample space of all possible sequences in a trial that employs s-length sequences. Let Q be the set of sequences that are realized in a particular experiment. Since carryover differs from sequence to sequence, then even though D* is an unbiased estimator of µ, if Q is randomly generated then it is unlikely that D*|Q ("D* given Q") will be biased. This is why Q should not be randomly generated; it should be fixed, and unless a subset of Q known to be unbiased under the assumed carryover model can be found, then Q should equal S.

Did you mean for the bolded word to be "unbiased"?

Second, all the discussion so far has tacitly assumed that the carryover effect is identical from subject to subject; however, that assumption is implausible. Let's assume that the form of the carryover effect is incremented by c after a calm trial and reset to baseline after an activating trial. Then it is almost sure that the value of c will differ from subject to subject. But carryover also differs from sequence to sequence. So if a subject whose value of c is high, say, is randomized to a sequence with a lot of consecutive Cs and hence a large carryover effect, this will bias D*. That is to say that if P is the set of participants in the experiment, then D*|P may be biased. The way to overcome this is to randomize many subjects to each sequence in the experiment.

The above two considerations taken together imply that, to be reasonably confident that D*|S,P ("D* given S and P") is approximately unbiased, the sequence length in the experiment should be short, so that the size of S is small, so that many subjects may be randomized to each of the fixed set of sequences S. But my suspicion is that parapsychologists are doing the exact opposite: generating long, random sequences, with the number of participants much less than the number of sequences in S.

Some are suggesting single trials with a large number of participants.

Linda
 
Did you mean for the bolded word to be "unbiased"?

Yes. Thanks. Fixed in the original.

Some are suggesting single trials with a large number of participants.

That is really what they should do. I didn't bring it up, because I figured I'd just get the usual "parapsychology doesn't have the money" response.
 
That is really what they should do. I didn't bring it up, because I figured I'd just get the usual "parapsychology doesn't have the money" response.

Yes, if you or I suggest it, it will be unreasonable. :)

Linda
 
The correlations I've described above - which relate to all the possible sequences - will clearly lead to bias, so whether the treatment of these two exceptional sequences introduces a bit of extra bias seems a rather academic point.

I did not mean to imply that the two special sequences require extra work. The two special sequences imply that the estimator is undefined. So, since the estimator is undefined, the correlations you've described seem a rather academic point.
 