In the 1850s, the upper nobility in England were living almost 70% longer than commoners. This graph shows average life expectancy for “ducal” families. These were the highest rank of nobility, the dukes, just below the King’s nuclear family in the royal hierarchy. They were the top 1% of the top 1% of elites in England.
The surprising thing is that these elites actually had a lower life expectancy than commoners for hundreds of years before the beginning of the scientific revolution in the mid-1700s, despite the fact that the elites had better food, clothing, and housing, and much less dangerous occupations. Their servants had to empty and clean their pots of “night soil” by hand every day, work in their mines, sail their ships, and fight on the front lines in their wars. While commoners risked life and limb, the nobility was the leisure class.
There are two main theories about why the richest nobility died earlier than commoners for centuries. One theory is a version of the sin of gluttony. Whereas the commoners expected to exist near the edge of starvation at times during their lives, the elites could afford to drink beer and wine instead of water. That helped the elites avoid the bacterial diseases that plagued the commoners, but even fairly modest consumption of alcohol on a regular basis has been shown to reduce life expectancy, and large quantities of alcohol knock off years of life. Elites were also more likely to engage in other harmful habits like tobacco and opium. The other theory is that only the elites could afford doctors, and doctors’ interventions in the West were more deadly than doing nothing during this period.
Consider the fates of wartime rivals King George III and George Washington. They were the wealthiest and most powerful elites in England and America respectively. In 1788, King George III had spasms of pain in his abdomen. His doctor had no idea what was causing them and gave him two powerful laxatives, even though the cause could have been something like a kidney stone that has nothing to do with digestion. Then, fearing the laxatives were too strong and would cause too much diarrhea, the doctor gave him a tincture of opium, which causes constipation and is an addictive sedative. The spasms continued and his behavior became erratic.
In November of that year, he became seriously deranged, sometimes speaking for many hours without pause, causing him to foam at the mouth and his voice to become hoarse. George would frequently repeat himself and write sentences with over 400 words at a time, and his vocabulary became “more complex, creative and colourful.”
This is recounted in the movie The Madness of King George. An examination of George’s hair in the 2000s gave an explanation for why he may have gone mad. His hair had arsenic levels that were seventeen times higher than the standard for arsenic poisoning, and his physicians’ records show that his main treatment for his abdominal spasms was emetic tartar, which was between 2 and 5 percent arsenic. An emetic is a medicine whose purpose is to cause vomiting, which is more likely to exacerbate abdominal spasms and pain than to relieve them. In addition, he was plastered with caustic poultices. A caustic is a chemical that is “able to burn or corrode organic tissue,” and his famous doctors, including Francis Willis, would put thick poultices on his skin in order to create wounds that would ooze, blister, and bleed out supposed “evil humours.”
Surprisingly, he survived his treatment that time around and made a full recovery. George Washington was not so lucky. Here is an account from William Rosen’s book about the history of antibiotics:
By the time the sun had risen, Washington’s overseer, George Rawlins… had opened a vein in Washington’s arm from which he drained approximately twelve ounces of his employer’s blood. Over the course of the next ten hours, two other doctors—Dr. James Craik and Dr. Elisha Dick—bled Washington four more times, extracting as much as one hundred additional ounces. Removing at least 60 per cent of their patient’s total blood supply was only one of the curative tactics used by Washington’s doctors. The former president’s neck was coated with a paste composed of wax and beef fat mixed with an irritant made from the secretions of dried beetles, one powerful enough to raise blisters, which were then opened and drained, apparently in the belief that it would remove the disease-causing poisons. He gargled a mixture of molasses, vinegar, and butter; his legs and feet were covered with a poultice made from wheat bran; he was given an enema; and, just to be on the safe side, his doctors gave Washington a dose of calomel—mercurous chloride—as a purgative. Unsurprisingly, none of these therapeutic efforts worked.
Doctors commonly gave people poisons like mercury and arsenic and covered their skin with caustic chemicals to produce blisters and inflated their intestines full of chemicals pumped in through their anuses (an enema) to wash them out on the inside before draining them of blood like vampires until they lost consciousness. Here is an excerpt from Extra Life, Steven Johnson’s fantastic history of why people live twice as long today as any known population before the 1700s:
As late as the onset of World War I, William Osler, the founder of Johns Hopkins, advocated for bloodletting as a primary intervention for military men who came down with influenza and other illnesses: “To bleed at the very onset in robust, healthy individuals in whom the disease sets in with great intensity and high fever is, I believe, a good practice.”… If you happened to be a pharmacist in 1900 looking to stock your shelves with medicinal cures for various ailments—gout, perhaps, or indigestion—you would likely consult the extensive catalog of Parke, Davis & Company, now Parke-Davis, one of the most successful and well-regarded drug companies in the United States. In the pages of that catalog, you would have seen products like Damiana et Phosphorus cum Nux, which combined a psychedelic shrub and strychnine to create a product designed to “revive sexual existence.” Another elixir …contained [deadly nightshade], arsenic, and mercury. Cocaine was sold in an injectable form, as well as in powders and cigarettes. The catalog proudly announced that the drug would “[take] the place of food, make the coward brave, the silent eloquent and … render the sufferer insensitive to pain.” As the medical historian William Rosen writes, “Virtually every page in the catalog of Parke, Davis medications included a compound as hazardous as dynamite, though far less useful.” …The historian John Barry notes that “the 1889 edition of the Merck Manual of Medical Information recommended one hundred treatments for bronchitis, each one with its fervent believers, yet the current editor of the manual recognizes that ‘none of them worked.’ The manual also recommended, among other things, champagne, strychnine, and nitroglycerin for seasickness.”
…A twenty-seven-year-old Tennessean named Samuel Evans Massengill dropped out of medical school to start his own drug company with the aim of producing a sulfa [antibiotic] variant that would be easier to consume. In 1937, the chief chemist Harold Watkins at the newly formed S. E. Massengill Company hit upon the idea of dissolving the drug in diethylene glycol, with raspberry flavoring added to make the concoction … more palatable to children. The company rushed the concoction to market under the brand Elixir Sulfanilamide, shipping 240 gallons of the medicine to pharmacies around the United States, promising a child-friendly cure for strep throat. While sulfa did in fact have meaningful antibacterial effects, and the raspberry flavoring added the proverbial spoonful of sugar, diethylene glycol is toxic to humans. Within weeks, six deaths were reported in Tulsa, Oklahoma, linked to the “elixir,” each one from kidney failure. The deaths triggered a frantic nationwide search, with agents from the Food and Drug agency poring over pharmacy records, alerting doctors, and warning anyone who had purchased the drug to immediately destroy it. But the FDA didn’t have enough pharmacological expertise on staff to determine what made the drug so lethal. And so they outsourced that detective work to …Eugene Geiling. Within weeks, Geiling had his entire team of graduate students testing all the ingredients of the elixir on a small menagerie of animals in the lab: dogs, mice, and rabbits. Geiling quickly identified diethylene glycol—a close chemical relative of antifreeze—as the culprit… Many more [patients] had been hospitalized with severe kidney problems, narrowly avoiding death.
Massengill was ultimately fined $24,000 for selling the poison to unwitting consumers… [and] he declared, “I do not feel that there was any responsibility on our part.”
The pharmaceutical companies had no legal incentive to concoct elixirs that actually worked, given the limited oversight of the FDA [before 1938]. As long as their lists of ingredients were correct, they had free rein to sell whatever miracle potion they wanted. Even when one of those ingredients happened to be a known poison that killed 104 people, the penalty was only a financial slap on the wrist.
One might think that the market itself would provide adequate incentives for the pharma companies to produce effective medicines. Elixirs that actually cured the ailments they promised to cure would sell more than elixirs that were predicated on junk science. But the market mechanisms behind medical drugs were complicated by two factors that do not apply to most other consumer products. The first is the placebo effect. On average, human beings do tend to see improved health outcomes when they are told they are being given a useful medicine, even if the medicine they’re taking is a sugar pill. How placebos actually work is still not entirely understood, but the effect is real. There is no equivalent placebo effect for, say, televisions or shoes. If you go into business selling fake televisions, 20 percent of your customers are not going to somehow imagine fake television shows when they get their purchase back to their living rooms. But a pharma company selling fake elixirs will reliably get positive outcomes from a meaningful portion of its customers.
The other reason market incentives fail with medicine is that human beings have their own internal pharmacies in the form of their immune systems. Most of the time when people get sick, they get better on their own—thanks to the brilliant defense system of leukocytes, phagocytes, and lymphocytes that recognizes and fights off threats or injuries and repairs damage. As long as your magic elixir didn’t cause [massive amounts of] kidney failure, you could sell your concoction to consumers and most of the time they would indeed see results. Their strep throat would subside, or their fever would go down—not because they’d ingested some quack’s miracle formula but because their immune system was quietly, invisibly doing its job. From the patient’s point of view, however, the miracle formula deserved all the credit.
Figuring out which treatments cure people and which are shams is very hard. Bleeding sick people was still in American medical textbooks in the 1920s. Why have doctors mostly stopped bleeding their patients and giving them poisons like mercury, arsenic, deadly nightshade, and strychnine? Statistics.
It takes statistics to know which medicines are killing people and which are not, because not everyone will die if you give them a modest dose of a poison. And not everyone will be cured if you give them a perfectly good medicine either. Unfortunately, some people will die no matter what doctors try. There is a large random element involved in disease that causes noise in the data. Figuring out cause and effect in the presence of uncertainty requires statistics, and the more randomness and complexity there is, the harder it is to determine cause and effect.
It turns out that the best way to fight randomness is with more randomness: randomized controlled trials, or RCTs, are the gold standard for researching cause and effect. This is the most important reason why medicine works much better today than it did in the early 1900s.
To do an RCT on a COVID treatment, you would get a large group of infected patients and randomly divide them into two groups: an “experimental group” that gets the treatment and a “control group” that gets a placebo treatment, like a sugar pill or a saline injection, that is known to have no effect on health. Then you see how much difference there is between the two groups. That is where the statistics finally come in, because if there are 100 patients in each group and 4 die in the experimental group and 5 die in the control group, is that really a meaningful difference or is it more likely due to random chance? Statistics is all about dealing with that kind of uncertainty. In this particular example, although 25% more people died who didn’t get the treatment, this is not a statistically significant difference between the two groups, so it teaches us almost nothing one way or the other, except that the treatment does not come close to curing 100% of COVID cases! After all, there is only a difference of one person, which is basically just an anecdote (a case study).
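To make the 4-versus-5 example concrete, here is a minimal sketch of one common way to check it, Fisher's exact test, written in plain Python. The choice of test and the function name `fisher_exact_two_sided` are assumptions made for illustration; nothing above says which test a real trial would use. The question it answers is: if the treatment did nothing and those 9 deaths were going to happen anyway, how often would random assignment alone split them at least as unevenly as 4 and 5?

```python
from math import comb

def fisher_exact_two_sided(deaths_a, n_a, deaths_b, n_b):
    """Two-sided Fisher's exact test p-value for deaths in two groups.

    Hypothetical helper for illustration: it treats the total number of
    deaths as fixed and counts how they could fall into group A by chance.
    """
    total_deaths = deaths_a + deaths_b
    lowest = max(0, total_deaths - n_b)    # fewest deaths group A could have
    highest = min(n_a, total_deaths)       # most deaths group A could have

    def ways(k):
        # Number of ways to place k of the deaths in group A and the rest in B
        return comb(n_a, k) * comb(n_b, total_deaths - k)

    observed = ways(deaths_a)
    # Add up every split that is as likely or less likely than the observed one
    as_extreme = sum(ways(k) for k in range(lowest, highest + 1)
                     if ways(k) <= observed)
    return as_extreme / comb(n_a + n_b, total_deaths)

# The example from the text: 4 deaths of 100 treated, 5 deaths of 100 controls
p_value = fisher_exact_two_sided(4, 100, 5, 100)
print(f"p = {p_value:.3f}")  # prints "p = 1.000"
```

A p-value of 1 means a 4-to-5 split is exactly the kind of result chance alone produces most often, so it is nowhere near the usual 0.05 cutoff for significance. For contrast, calling the same function with 0 deaths of 100 versus 20 deaths of 100 gives a p-value far below 0.05.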
The reason for randomization is to remove bias. It eliminates selection bias in assigning who gets treatment. For example, the sickest COVID patients often get the most treatment and they are also the most likely to die, so if you just compare patients who got the treatment with those who did not, a successful treatment might look like a failure. Or the group of people who volunteer for treatment might be healthier than those who do not volunteer. Only if you randomly divide a large population into two groups will the groups be roughly similar, which is what eliminates selection bias. But you have to have a large enough population for the randomness to work. If you randomly divide a population of six people, you are unlikely to get the same mix of men and women in the two groups, but if you divide up 600 people, there is unlikely to be much difference between the two groups.
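The six-versus-600 comparison can be worked out exactly. The sketch below assumes, purely for illustration, a population that is half women and half men, split at random into two equal groups; the function name `expected_gap` is hypothetical. It averages, over every equally likely split, the gap between the two groups' share of women.

```python
from math import comb

def expected_gap(n_people):
    """Average gap in the share of women between two randomly formed halves.

    Assumes n_people is even, exactly half are women, and each group gets
    n_people // 2 members. These are illustrative assumptions only.
    """
    women = n_people // 2
    men = n_people - women
    group = n_people // 2
    total_splits = comb(n_people, group)

    gap = 0.0
    for k in range(group + 1):
        # k women land in group A, so the other (women - k) land in group B
        probability = comb(women, k) * comb(men, group - k) / total_splits
        gap += probability * abs(k - (women - k)) / group
    return gap

for n in (6, 600):
    print(f"{n} people: average gap in share of women = {expected_gap(n):.1%}")
```

With six people the two groups differ by 40 percentage points on average, and they can never match exactly because three women cannot be split evenly. With 600 people the average gap shrinks to a few percentage points, which is why large randomized groups end up looking alike on sex and, by the same logic, on characteristics nobody thought to measure.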
There can also be a treatment bias if patients and doctors know who is getting a treatment and who is not. Doctors might unconsciously help one group more than the other, or there could be a placebo effect in which the patients actually feel better because they are psychologically reassured by the treatment they are getting. That can be avoided by doing a “double blind” study in which neither patients nor doctors know which group of patients is getting the real treatment and which is getting the control.
Although the earliest recorded trial resembling an RCT was conducted by James Lind in 1747, it had almost zero impact on how science was conducted. Most people still preferred anecdotes (case studies) as evidence. Lind’s study successfully showed the British Navy that scurvy could be eliminated by eating citrus, and British sailors became known as “limeys” as a result! But it was a primitive study that was too small and lacked a placebo, and its participants were not fully randomized.
Psychologists and education researchers started occasionally doing RCTs in the 1880s, and agricultural researchers like R. A. Fisher began publishing on randomized experiments in the 1930s. The first true RCT in medicine was not published until 1948!
RCTs were very rarely done until 1966 when the FDA began testing drugs for efficacy. As you can see in the chart below, that was revolutionary. Before that, nobody knew how well most treatments worked or if they really even worked at all. In many ways, that is the beginning of when medicine finally became scientific, and the growth of medical science has been exponential since then.
To quote another excerpt of Steven Johnson’s excellent book:
Of all the late arrivals of intellectual and technological history… the RCT may well be the most puzzling, …There are few methodological revolutions in the history of science as significant as the invention of the RCT. (Only the seventeenth-century formulation of the scientific method itself—building hypotheses, testing them, refining them based on feedback from the rest—looms larger.) Like those empirical methods that Francis Bacon and other proto-Enlightenment scientists developed, the RCT is a surprisingly simple technique—so simple, in fact, that it begs the question of why it took so long for people to discover it… [The RCT is a] system for separating the quack cures from the real thing, one that avoids the many perils that had long bedeviled the science of medicine: anecdotal evidence, false positives, confirmation bias, and so on. When the FDA began demanding proof of efficacy from the drug manufacturers in 1962 [although the actual research didn’t come out until later as shown in the above graph], they could make that demand because a system was now in place—in the form of the RCT—that could meaningfully supply that kind of proof. The RCT emerged as a confluence of several distinct intellectual tributaries.
…The importance of randomization would not become apparent until the early twentieth [century], when the British statistician R. A. Fisher began exploring the concept in the context of agricultural studies as a way of testing the effectiveness of treatments on distinct plots of land. “Randomization properly carried out,” Fisher argued in his 1935 book, The Design of Experiments, “relieves the experimenter from the anxiety of considering and estimating the magnitude of the innumerable causes by which his data may be disturbed.” Fisher’s work on randomization and experiment design in the 1930s caught the eye of an epidemiologist and statistician named Austin Bradford Hill, who sensed in Fisher’s method a technique that could prove hugely beneficial for medical studies.
Hill would later echo Fisher’s description of the powers of randomization, writing that the technique “ensures that neither our personal idiosyncrasies (our likes or dislikes consciously or unwittingly applied) nor our lack of balanced judgement has entered into the construction of the different treatment groups—the allocation has been outside our control and the groups are therefore unbiased.” Hill recognized that the key to successful experiment design was … to remove his or her influence over the results of the experiment, the subtle contaminations that so often distorted the data.
As a young man, Hill had contracted tuberculosis while serving as a pilot in the Mediterranean, and so it was somewhat fitting that the first landmark study that Hill oversaw was investigating a new treatment for tuberculosis, the experimental antibiotic streptomycin. …the real significance of the study lay in its form. It is now widely considered to be the first genuine RCT ever conducted [in 1948]. Antibiotics… turned out to be the prime movers that finally transformed the world of medicine into a net positive force in terms of life expectancy. It is likely not a coincidence that the first true miracle drugs and the first true RCTs were developed within a few years of one another.
The two developments complemented each other: the discovery of antibiotics finally gave the researchers a drug worth testing, and the RCTs gave them a quick and reliable way to separate the promising antibiotics from the duds. Hill’s randomized, controlled investigation into the efficacy of streptomycin was a milestone in the history of experiment design. Its indirect effect on health outcomes …would have earned him a place in the pantheon of medical history had he never published another paper. But Austin Bradford Hill was just getting started. His next study would have a direct impact on millions [more] lives across the planet.
…another kind of killer was growing increasingly deadly across the population: lung cancer. The surge in deaths was truly alarming. By the end of the war, the Medical Research Council estimated that mortality from carcinoma of the lung had increased fifteen-fold from 1922. Cigarettes were one of the suspected causes, but many people pointed to other environmental causes: the exhaust from automobiles, the use of tar in roadways, other forms of industrial pollution.
…The Medical Research Council approached Hill and another noted epidemiologist named Richard Doll, asking the two men to investigate the lung cancer crisis. Today of course, even grade-schoolers are aware of the connection between smoking and lung cancer—even if some of them grow up to ignore it—but in the late 1940s, the link was not at all clear. “I myself did not expect to find smoking was a major problem,” Richard Doll would later recall. “If I’d had to bet money at that time, I would have put it on something to do with the roads and motorcars.” Hill and Doll devised a brilliant experiment to test the hypothesis that smoking might be connected to the surge in lung cancer cases.
The structure was a kind of inverted version of a traditional drug trial. The experimental group was not given an experimental medicine, and there was no placebo. Instead, the experimental group was made up of people with existing cases of lung cancer. Hill and Doll approached twenty different London hospitals to find [enough lung cancer patients to get] a statistically meaningful group… They then recruited … control groups at each hospital…. For each member of the “experimental” group—that is, the group with lung cancer—they tried to match with a control patient who was roughly the same age and economic class, and who lived in the same neighborhood or town. With those variables the same in each group, Hill and Doll ensured that some confounding factor wouldn’t contaminate the results. Imagine, for instance, that the lung cancer surge turned out to be caused by the industrial soot in Lancashire factories. An experiment that didn’t control for place of residence or economic status (factory worker versus salesclerk, say) wouldn’t be able to detect that causal link. But by assembling an experimental group and a control group that were broadly similar to each other [demographically], Hill and Doll could investigate whether there was a meaningful difference between the two groups in terms of smoking habits.
In the end, 709 people with lung cancer were interviewed about their smoking history, with the same number in the control group. Hill and Doll … explored those histories along different dimensions: average cigarettes smoked per day; total tobacco consumed over one’s lifetime; age when the subject began smoking. Once the numbers had been crunched, the results were overwhelming. “Whichever measure of smoking is taken,” Hill and Doll wrote, “the same result is obtained—namely, a significant and clear relationship between smoking and carcinoma of the lung.”
At the end of the paper … Hill and Doll made a rough attempt to evaluate the impact of heavy smoking on the probability of contracting lung cancer. By their estimate, a person who smoked more than a pack a day was fifty times more likely to develop lung cancer than a nonsmoker. The number was shocking at the time, but we now know it to have been a wild understatement of the risk. Heavy smokers are in fact closer to five hundred times more likely to develop lung cancer than nonsmokers.
Despite the overwhelming evidence the study conveyed, and the rigor of its experimental design, the 1950 paper they published …was initially dismissed by the medical establishment. Years later, Doll was asked why so many authorities ignored the obvious evidence that he and Hill had accumulated. “One of the problems we found in trying to convince the scientific community,” he explained, “was that thinking at that time was dominated by the discovery of bacteria such as diphtheria, typhoid, and the tubercle, which had been the basis for the big advances in medicine in the last decades of the nineteenth century.
When it came to drawing conclusions from an epidemiology study, scientists tended to use the rules that had been used to show that a particular germ was the cause of an infectious disease.” In a sense, the medical establishment had been blinded by its own success identifying the causes of other diseases. While an overwhelming number of lung cancer patients had turned out to be heavy smokers, there were still a number of nonsmokers who had suffered from the disease. Using the old paradigm, those nonsmokers were like finding a cholera patient who had never ingested the Vibrio cholerae bacterium. “But, of course, nobody was saying [smoking] was the cause; what we were saying is that it is a cause,” Doll explained. “People didn’t realize that these chronic diseases could have multiple causes.”
Undeterred, Hill and Doll set out to conduct another experiment…. They decided to see if they could predict cases of lung cancer by analyzing people’s cigarette use and health outcomes over many years. This time they used physicians themselves as the subjects, sending out questionnaires to more than fifty thousand doctors in the United Kingdom, interviewing them about their own smoking habits and then tracking their health over time. “We planned to do the study for five years,” Doll later recalled. “But within two and a half years, we already had 37 deaths from lung cancer [among smokers] and none in nonsmokers.” They published their results early, in 1954, in what is now considered a watershed moment in the scientific establishment’s understanding of the causal link between smoking and cancer.
In that 1954 paper, the experiment design proved to be less important than the unusual choice of subjects. Hill and Doll had originally decided to interview physicians because it was easier to follow up with them to track their health and cigarette use over the ensuing years. But the decision proved to have additional benefits. “It turned out to have been very fortunate to have chosen doctors, from a number of points of view,” Doll noted. “One was that the medical profession in this country became convinced of the findings quicker than anywhere else. They said, ‘Goodness! Smoking kills doctors, it must be very serious.’” Exactly ten years after the publication of Hill and Doll’s second investigation…, the United States [Surgeon General,] Luther Terry, a physician, famously issued his Report on the Health Consequences of Smoking, which officially declared that cigarettes posed a significant health threat. (After nervously puffing on a cigarette on the way to the announcement, Terry was asked during the press conference whether he himself was a smoker. “No,” he replied. When asked how long it had been since he had quit, he replied, “Twenty minutes.”) …When Hill and Doll interviewed their first patients in the London hospitals, more than 50 percent of the UK population were active smokers. Today the number is just 16 percent. Quitting smoking before the age of thirty-five is now estimated to extend your life expectancy by as much as nine years.
It is a fantastic book and you should buy it and read the whole thing! Without statistical methods in general and RCTs in particular, medical care would have produced very little benefit in the last half century. Of course, in earlier times, more basic parts of the scientific method were more important than statistics, because some effects were so big that they were obvious without statistics. For example, sanitation, pasteurization, and water purification had such big, immediate effects that RCTs weren’t needed. But without RCTs, we would have no COVID vaccines today and no idea what works for treating COVID.