1
00:00:07,400 --> 00:00:09,480
People of science. Take one.
2
00:00:14,600 --> 00:00:17,640
David, you chose Thomas Bayes
and Ronald Fisher.
3
00:00:17,640 --> 00:00:20,880
What do both these people
of science mean to you?
4
00:00:20,880 --> 00:00:22,480
A huge amount.
5
00:00:22,480 --> 00:00:26,000
These are two huge figures in the
history of statistical inference.
6
00:00:26,000 --> 00:00:28,360
Bayes, I was introduced
to those ideas
7
00:00:28,360 --> 00:00:30,800
when I first was a student
studying mathematics.
8
00:00:30,800 --> 00:00:32,600
I found them absolutely riveting.
9
00:00:32,600 --> 00:00:35,360
This idea that we could apply
probabilities to facts,
10
00:00:35,360 --> 00:00:37,240
I've stuck with my whole life.
11
00:00:37,240 --> 00:00:40,120
I've been a Bayesian statistician,
as it's known, in my research work.
12
00:00:40,120 --> 00:00:42,840
I teach both of them
but over that time also,
13
00:00:42,840 --> 00:00:46,320
I've come to develop
a huge respect for Fisher.
14
00:00:46,320 --> 00:00:48,440
He was a genius mathematician.
15
00:00:48,440 --> 00:00:50,880
Just about the entire
scientific literature
16
00:00:50,880 --> 00:00:54,680
or anyone who does a statistical
check of a hypothesis,
17
00:00:54,680 --> 00:00:56,920
you use this idea of a p-value.
18
00:00:56,920 --> 00:00:59,320
Ronald Fisher invented the p-value.
19
00:00:59,320 --> 00:01:01,200
So that's when we say...
20
00:01:01,200 --> 00:01:04,160
In particle physics we'll say
we've discovered the Higgs boson,
21
00:01:04,160 --> 00:01:08,000
it's a five sigma discovery
and that's Fisher.
22
00:01:08,000 --> 00:01:09,640
That's Fisher, yes.
23
00:01:09,640 --> 00:01:13,520
Now Bayes, we're going back
a long time to Bayes.
24
00:01:13,520 --> 00:01:16,640
When would Bayes...
Bayes was extraordinary.
25
00:01:16,640 --> 00:01:19,760
He was a nonconformist minister
in Tunbridge Wells
26
00:01:19,760 --> 00:01:21,880
and he was an amateur mathematician.
27
00:01:21,880 --> 00:01:23,920
He died in 1761.
28
00:01:23,920 --> 00:01:28,040
But then afterwards, in his papers
was found a manuscript
29
00:01:28,040 --> 00:01:31,800
that then was published a couple of
years later by the Royal Society
30
00:01:31,800 --> 00:01:35,160
and this manuscript
has become enormously famous
31
00:01:35,160 --> 00:01:37,120
and hugely influential.
32
00:01:37,120 --> 00:01:40,880
Probability around Bayes' time
was used in two different ways.
33
00:01:40,880 --> 00:01:42,880
It was used in the idea of chance.
34
00:01:42,880 --> 00:01:45,720
You know, future events.
Pure unpredictability.
35
00:01:45,720 --> 00:01:48,760
But it was also used
when you were uncertain, say,
36
00:01:48,760 --> 00:01:51,600
about whether someone was
guilty of a crime or not.
37
00:01:51,600 --> 00:01:54,000
In other words,
uncertainty about a fact.
38
00:01:54,000 --> 00:01:56,000
Bayes put these two together
39
00:01:56,000 --> 00:02:00,080
and to assign probabilities to those
is still deeply controversial.
40
00:02:00,080 --> 00:02:02,640
Fisher loathed the idea.
41
00:02:02,640 --> 00:02:06,480
So whilst Bayes seems
like a relatively nice man,
42
00:02:06,480 --> 00:02:08,520
preaching in Tunbridge Wells,
43
00:02:08,520 --> 00:02:11,560
Fisher is a different
kettle of fish.
44
00:02:11,560 --> 00:02:13,000
Yes.
45
00:02:13,000 --> 00:02:16,160
He's what you might call a
slightly difficult personality.
46
00:02:16,160 --> 00:02:19,080
He could be quite kind
and generous to his students
47
00:02:19,080 --> 00:02:22,280
but if there was any suggestion
that anyone would threaten him
48
00:02:22,280 --> 00:02:25,560
or question him,
he became very aggressive indeed.
49
00:02:25,560 --> 00:02:29,160
He had a foul temper
and he just fell out with people
50
00:02:29,160 --> 00:02:32,840
again and again
for their whole lives.
51
00:02:32,840 --> 00:02:37,440
Which brings me to the question,
you've chosen these two individuals,
52
00:02:37,440 --> 00:02:40,200
so what is the difference
between them?
53
00:02:40,200 --> 00:02:41,960
The core of the disagreement
54
00:02:41,960 --> 00:02:44,920
is whether it's reasonable
to assign probability to a fact.
55
00:02:44,920 --> 00:02:47,320
Something that is potentially
ascertainable
56
00:02:47,320 --> 00:02:49,360
but we just don't know what that is.
57
00:02:49,360 --> 00:02:52,240
Bayes said it was
and developed the calculus,
58
00:02:52,240 --> 00:02:54,800
the mathematics for dealing with it.
59
00:02:54,800 --> 00:02:56,520
He's got this lovely experiment
60
00:02:56,520 --> 00:02:59,560
to do with balls being thrown
on to a billiard table.
61
00:02:59,560 --> 00:03:03,440
So Bayes' thought experiment
was to take a billiard table
62
00:03:03,440 --> 00:03:05,400
and throw a ball at random on to it
63
00:03:05,400 --> 00:03:07,920
and I'm going to guess
where it landed.
64
00:03:15,360 --> 00:03:19,160
OK. So take the ball away. Yes. OK.
65
00:03:19,160 --> 00:03:21,480
I have to guess where that is
66
00:03:21,480 --> 00:03:24,960
and the only information
I'm going to get is what happens
67
00:03:24,960 --> 00:03:27,800
when you throw more balls
on to the table
68
00:03:27,800 --> 00:03:31,800
and you're going to tell me then
which side of that line they lie on.
69
00:03:31,800 --> 00:03:34,520
So could you do that?
Just start throwing balls on.
70
00:03:34,520 --> 00:03:36,520
Just in random directions?
Just random directions.
71
00:03:36,520 --> 00:03:39,120
And random speeds, I suppose.
Yes, and then just...
72
00:03:40,560 --> 00:03:43,600
What you should do now is tell me
how many landed
73
00:03:43,600 --> 00:03:46,000
on this side of the line
and how many landed
74
00:03:46,000 --> 00:03:47,840
on that side of the line.
75
00:03:47,840 --> 00:03:51,640
Three of them are on your left
as you stand like that
76
00:03:51,640 --> 00:03:53,840
and two of them are over here.
OK, right.
77
00:03:53,840 --> 00:03:56,960
You might think then that
I should estimate the line
78
00:03:56,960 --> 00:03:59,440
is two fifths of the way
along the table.
79
00:03:59,440 --> 00:04:02,240
Yes. That's what Fisher would say.
Two fifths.
80
00:04:02,240 --> 00:04:04,040
Bayes would not say that.
81
00:04:04,040 --> 00:04:06,880
He would say it's three sevenths
of the way along the table.
82
00:04:06,880 --> 00:04:09,080
But the data would only say
two fifths
83
00:04:09,080 --> 00:04:11,480
and that's what Fisher would say
just using the data.
84
00:04:11,480 --> 00:04:14,600
Bayes would pull it a bit towards
the middle and say it's there.
85
00:04:14,600 --> 00:04:17,920
And what's the difference between
Fisher's approach and Bayes' approach?
86
00:04:17,920 --> 00:04:22,360
Fisher's approach, he will just use
the information from the data alone.
87
00:04:22,360 --> 00:04:25,720
Whereas the Bayesian approach would
use also the fact that I know
88
00:04:25,720 --> 00:04:29,120
that you threw that first ball
at random to lie on this table.
89
00:04:29,120 --> 00:04:31,960
And that piece of information
actually changes what I think.
90
00:04:31,960 --> 00:04:34,520
I'm going to tell you
something actually.
91
00:04:34,520 --> 00:04:37,480
Because actually the answer was
that I think it was about here,
92
00:04:37,480 --> 00:04:40,200
which is somewhere between
two fifths and three sevenths.
93
00:04:40,200 --> 00:04:42,360
Between two fifths
and three sevenths.
94
00:04:42,360 --> 00:04:44,720
But that was roughly
where the ball was.
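[Editor's note] The two estimates quoted in the billiard-table demonstration can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: the first ball's position is taken to be uniformly distributed along the table (the "thrown at random" information), k of the n later balls land on the side the distance is measured from, and the Bayesian answer is taken to be the posterior mean. The function names are illustrative and do not come from the programme.

```python
from fractions import Fraction


def data_only_estimate(k: int, n: int) -> Fraction:
    """Estimate from the data alone: the observed proportion k/n."""
    return Fraction(k, n)


def uniform_prior_estimate(k: int, n: int) -> Fraction:
    """Posterior mean under a uniform prior on the line's position.

    With a uniform prior and k of n balls on the near side, the posterior
    is Beta(k + 1, n - k + 1), whose mean is (k + 1) / (n + 2).
    """
    return Fraction(k + 1, n + 2)


# Five balls thrown, two on the side we measure from (as in the demonstration).
print(data_only_estimate(2, 5))      # 2/5
print(uniform_prior_estimate(2, 5))  # 3/7
```

The second estimate sits closer to one half than the first, which is the "pull towards the middle" described in the conversation.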
95
00:04:48,600 --> 00:04:54,080
How important is the work of Bayes
and Fisher to the modern world?
96
00:04:54,080 --> 00:04:56,280
Bayesian ideas are everywhere.
97
00:04:56,280 --> 00:04:59,200
Your spam filter is probably
a Bayesian spam filter.
98
00:04:59,200 --> 00:05:02,440
All sorts of image processing
techniques, a huge amount
99
00:05:02,440 --> 00:05:06,480
of machine learning algorithms,
are based on Bayesian methodology.
100
00:05:06,480 --> 00:05:10,240
And Fisherian methods, again,
staggeringly important.
101
00:05:10,240 --> 00:05:12,400
Every scientific paper you read
102
00:05:12,400 --> 00:05:14,800
is going to have a p-value
at the end of it.
103
00:05:14,800 --> 00:05:18,840
But it's all to do with how
data changes our judgment,
104
00:05:18,840 --> 00:05:21,880
our knowledge,
what we can learn from data.
105
00:05:21,880 --> 00:05:24,480
And that's what the
modern world's about.