1 00:00:07,400 --> 00:00:09,480 People of science. Take one. 2 00:00:14,600 --> 00:00:17,640 David, you chose Thomas Bayes and Ronald Fisher. 3 00:00:17,640 --> 00:00:20,880 What do both these people of science mean to you? 4 00:00:20,880 --> 00:00:22,480 A huge amount. 5 00:00:22,480 --> 00:00:26,000 These are two huge figures in the history of statistical inference. 6 00:00:26,000 --> 00:00:28,360 Bayes, I was introduced to those ideas 7 00:00:28,360 --> 00:00:30,800 when I first was a student studying mathematics. 8 00:00:30,800 --> 00:00:32,600 I found them absolutely riveting. 9 00:00:32,600 --> 00:00:35,360 This idea that we could apply probabilities to facts, 10 00:00:35,360 --> 00:00:37,240 I've stuck with my whole life. 11 00:00:37,240 --> 00:00:40,120 I've been a Bayesian statistician, as it's known, in my research work. 12 00:00:40,120 --> 00:00:42,840 I teach both of them but over that time also, 13 00:00:42,840 --> 00:00:46,320 I've come to develop a huge respect for Fisher. 14 00:00:46,320 --> 00:00:48,440 He was a genius mathematician. 15 00:00:48,440 --> 00:00:50,880 Just about the entire scientific literature 16 00:00:50,880 --> 00:00:54,680 or anyone who does a statistical check of a hypothesis, 17 00:00:54,680 --> 00:00:56,920 you use this idea of a p-value. 18 00:00:56,920 --> 00:00:59,320 Ronald Fisher invented the p-value. 19 00:00:59,320 --> 00:01:01,200 So that's when we say... 20 00:01:01,200 --> 00:01:04,160 In particle physics we'll say we've discovered the Higgs boson, 21 00:01:04,160 --> 00:01:08,000 it's a five sigma discovery and that's Fisher. 22 00:01:08,000 --> 00:01:09,640 That's Fisher, yes. 23 00:01:09,640 --> 00:01:13,520 Now Bayes, we're going back a long time to Bayes. 24 00:01:13,520 --> 00:01:16,640 When would Bayes... Bayes was extraordinary. 25 00:01:16,640 --> 00:01:19,760 He was a nonconformist minister in Tunbridge Wells 26 00:01:19,760 --> 00:01:21,880 and he was an amateur mathematician. 
27 00:01:21,880 --> 00:01:23,920 He died in 1761. 28 00:01:23,920 --> 00:01:28,040 But then afterwards, in his papers was found a manuscript 29 00:01:28,040 --> 00:01:31,800 that then was published a couple of years later by the Royal Society 30 00:01:31,800 --> 00:01:35,160 and this manuscript has become enormously famous 31 00:01:35,160 --> 00:01:37,120 and hugely influential. 32 00:01:37,120 --> 00:01:40,880 Probability around Bayes' time was used in two different ways. 33 00:01:40,880 --> 00:01:42,880 It was used in the idea of chance. 34 00:01:42,880 --> 00:01:45,720 You know, future events. Pure unpredictability. 35 00:01:45,720 --> 00:01:48,760 But it was also used when you were uncertain say 36 00:01:48,760 --> 00:01:51,600 about whether someone was guilty of a crime or not. 37 00:01:51,600 --> 00:01:54,000 In other words, uncertainty about a fact. 38 00:01:54,000 --> 00:01:56,000 Bayes put these two together 39 00:01:56,000 --> 00:02:00,080 and to assign probabilities to those is still deeply controversial. 40 00:02:00,080 --> 00:02:02,640 Fisher loathed the idea. 41 00:02:02,640 --> 00:02:06,480 So whilst Bayes seems like a relatively nice man, 42 00:02:06,480 --> 00:02:08,520 preaching in Tunbridge Wells, 43 00:02:08,520 --> 00:02:11,560 Fisher is a different kettle of fish. 44 00:02:11,560 --> 00:02:13,000 Yes. 45 00:02:13,000 --> 00:02:16,160 He's what you might call a slightly difficult personality. 46 00:02:16,160 --> 00:02:19,080 He could be quite kind and generous to his students 47 00:02:19,080 --> 00:02:22,280 but if there was any suggestion that anyone would threaten him 48 00:02:22,280 --> 00:02:25,560 or question him, he became very aggressive indeed. 49 00:02:25,560 --> 00:02:29,160 He had a foul temper and he just fell out with people 50 00:02:29,160 --> 00:02:32,840 again and again for their whole lives. 
51 00:02:32,840 --> 00:02:37,440 Which brings me to the question, you've chosen these two individuals, 52 00:02:37,440 --> 00:02:40,200 so what is the difference between them? 53 00:02:40,200 --> 00:02:41,960 The core of the disagreement 54 00:02:41,960 --> 00:02:44,920 is whether it's reasonable to assign probability to a fact. 55 00:02:44,920 --> 00:02:47,320 Something that is potentially ascertainable 56 00:02:47,320 --> 00:02:49,360 but we just don't know what that is. 57 00:02:49,360 --> 00:02:52,240 Bayes said it was and developed the calculus, 58 00:02:52,240 --> 00:02:54,800 the mathematics for dealing with it. 59 00:02:54,800 --> 00:02:56,520 He's got this lovely experiment 60 00:02:56,520 --> 00:02:59,560 to do with balls being thrown on to a billiard table. 61 00:02:59,560 --> 00:03:03,440 So Bayes' thought experiment was to take a billiard table 62 00:03:03,440 --> 00:03:05,400 and throw a ball at random on to it 63 00:03:05,400 --> 00:03:07,920 and I'm going to guess where it landed. 64 00:03:15,360 --> 00:03:19,160 OK. So take the ball away. Yes. OK. 65 00:03:19,160 --> 00:03:21,480 I have to guess where that is 66 00:03:21,480 --> 00:03:24,960 and the only information I'm going to get is what happens 67 00:03:24,960 --> 00:03:27,800 when you throw more balls on to the table 68 00:03:27,800 --> 00:03:31,800 and you're going to tell me then which side of that line do they lie. 69 00:03:31,800 --> 00:03:34,520 So could you do that? Just start throwing balls on. 70 00:03:34,520 --> 00:03:36,520 Just in random directions? Just random directions. 71 00:03:36,520 --> 00:03:39,120 And random speeds, I suppose. Yes, and then just... 72 00:03:40,560 --> 00:03:43,600 What you should do now is tell me how many landed 73 00:03:43,600 --> 00:03:46,000 on this side of the line and how many landed 74 00:03:46,000 --> 00:03:47,840 on that side of the line. 
75 00:03:47,840 --> 00:03:51,640 Three of them are on your left as you stand like that 76 00:03:51,640 --> 00:03:53,840 and two of them are over here. OK, right. 77 00:03:53,840 --> 00:03:56,960 You might think then that I should estimate the line 78 00:03:56,960 --> 00:03:59,440 is two fifths of the way along the table. 79 00:03:59,440 --> 00:04:02,240 Yes. That's what Fisher would say. Two fifths. 80 00:04:02,240 --> 00:04:04,040 Bayes would not say that. 81 00:04:04,040 --> 00:04:06,880 He would say it's three sevenths of the way along the table. 82 00:04:06,880 --> 00:04:09,080 But the data would only say two fifths 83 00:04:09,080 --> 00:04:11,480 and that's what Fisher would say just using the data. 84 00:04:11,480 --> 00:04:14,600 Bayes would pull it a bit towards the middle and say it's there. 85 00:04:14,600 --> 00:04:17,920 And what's the difference between Fisher's approach and Bayes' approach? 86 00:04:17,920 --> 00:04:22,360 Fisher's approach, he will just use the information from the data alone. 87 00:04:22,360 --> 00:04:25,720 Whereas the Bayesian approach would use also the fact that I know 88 00:04:25,720 --> 00:04:29,120 that you threw that first ball at random to lie on this table. 89 00:04:29,120 --> 00:04:31,960 And that piece of information actually changes what I think. 90 00:04:31,960 --> 00:04:34,520 I'm going to tell you something actually. 91 00:04:34,520 --> 00:04:37,480 Because actually the answer was that I think it was about here, 92 00:04:37,480 --> 00:04:40,200 which is somewhere between two fifths and three sevenths. 93 00:04:40,200 --> 00:04:42,360 Between two fifths and three sevenths. 94 00:04:42,360 --> 00:04:44,720 But that was roughly where the ball was. 95 00:04:48,600 --> 00:04:54,080 How important is the work of Bayes and Fisher to the modern world? 96 00:04:54,080 --> 00:04:56,280 Bayesian ideas are everywhere. 97 00:04:56,280 --> 00:04:59,200 Your spam filter is probably a Bayesian spam filter. 
98 00:04:59,200 --> 00:05:02,440 All sorts of image processing techniques, a huge amount 99 00:05:02,440 --> 00:05:06,480 of machine learning algorithms, are based on Bayesian methodology. 100 00:05:06,480 --> 00:05:10,240 And Fisherian methods, again, staggeringly important. 101 00:05:10,240 --> 00:05:12,400 Every scientific paper you read 102 00:05:12,400 --> 00:05:14,800 is going to have a p-value at the end of it. 103 00:05:14,800 --> 00:05:18,840 But it's all to do with how data changes our judgment, 104 00:05:18,840 --> 00:05:21,880 our knowledge, what we can learn from data. 105 00:05:21,880 --> 00:05:24,480 And that's what the modern world's about.
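Editor's note on the billiard-table figures: the two estimates quoted in the interview (two fifths and three sevenths) can be reproduced with a short sketch. It assumes that "just using the data" means the observed proportion of balls on the near side of the line, and that the Bayesian figure is the posterior mean when the first ball's position is equally likely anywhere along the table (a uniform prior). The function names `fisher_estimate` and `bayes_estimate` are hypothetical labels chosen for illustration, not anything named in the interview.

```python
from fractions import Fraction


def fisher_estimate(k: int, n: int) -> Fraction:
    """Data-only estimate: the observed proportion k/n of balls on the near side."""
    return Fraction(k, n)


def bayes_estimate(k: int, n: int) -> Fraction:
    """Posterior mean of the line's position under a uniform prior.

    With k of n balls on the near side, the posterior is Beta(k + 1, n - k + 1),
    whose mean is (k + 1) / (n + 2). This is an assumed reading of the
    "three sevenths" figure quoted in the interview.
    """
    return Fraction(k + 1, n + 2)


# Five balls thrown: two on the near side of the line, three on the far side.
k, n = 2, 5
print("Data only:", fisher_estimate(k, n))   # 2/5
print("Uniform prior:", bayes_estimate(k, n))  # 3/7
```

With no balls thrown at all (k = 0, n = 0), the Bayesian sketch returns one half, the middle of the table, which is why each new observation only pulls the estimate part of the way from the middle towards the raw proportion.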