The Just Deserts of Capitalism and the Giant Turnip

≈35 minutes to read.

Why do some people earn more money than others? The overly simplistic answer from neoclassical economics is that people earn what they produce. If a doctor earns four times as much as her pastor, it is because the doctor produces four times more benefit for society. CEOs use this story to justify their large salaries: they are entitled to billions of dollars because, in capitalist markets, people earn what they produce. President Bush’s chief economist, Gregory Mankiw, calls this idea “just deserts,” in which “desert” means “entitlement” and “just” means social justice. Mankiw thinks this theory justifies vast inequality, and that the real social injustice in America is that rich people pay too much tax when they should be entitled to keep what they produce.

But reality is not so simple, because what determines income in capitalism is only tangentially related to productivity. Bargaining power determines wages. Productivity matters, but only as one of several factors that shape bargaining power. If everything else were the same, more productive people would tend to earn higher wages, but everything else is not the same: bargaining power is also determined by managerial hierarchy, government institutions, opportunity costs, personal preferences, and persuasive abilities.

Productivity would solely determine wages for a single person stranded on a deserted island, but most people don’t work like hermits. We are social animals and when people are working together on anything, the idea that productivity alone determines wages falls apart. Consider the parable of the Giant Turnip.

In this classic tale, a family grows a giant turnip in the garden every year, big enough to feed them all winter long. When grandpa goes out to harvest it, he cannot pull it out, so he asks grandma to help. They still cannot budge it, so they ask granddaughter too, then the dog, then the cat, and the turnip still won’t move. Finally, they ask a mouse, who pulls on the cat, who pulls on the dog, who pulls on granddaughter, who pulls on grandma, who pulls on grandpa, who pulls on the giant turnip, and POP! With the help of the mouse, the turnip finally comes out of the ground so they can bring it inside where it won’t be ruined by the upcoming freeze.

What is the marginal product of each person? Grandpa? Granddaughter? What is the marginal product of the mouse?

You could say that they all have a marginal product of zero because, by themselves, none of them could harvest the giant turnip. Or, following standard marginal productivity theory, you could say that each of them has a marginal product equal to the entire value of the turnip, because if any one of them had refused to cooperate, the rest could not have harvested it. No matter how you define marginal product, it provides zero guidance about how much of the turnip each worker should be paid, because their work is interdependent. Without a team, they produce nothing.
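The parable’s paradox can be made concrete with a toy calculation. This is just a sketch: the all-or-nothing production function and the turnip’s value of 100 are my own illustrative assumptions, not part of the tale.

```python
# Toy model of perfectly complementary team production:
# the turnip (worth V) comes out only if the ENTIRE team pulls.
V = 100  # assumed value of the turnip, in arbitrary units

team = {"grandpa", "grandma", "granddaughter", "dog", "cat", "mouse"}

def output(workers):
    """All-or-nothing production function: full team or nothing."""
    return V if workers == team else 0

# Marginal product of worker i = output(team) - output(team without i)
marginal_products = {w: output(team) - output(team - {w}) for w in team}

print(marginal_products)                 # every worker's MP equals the full V
print(sum(marginal_products.values()))   # six times the turnip's total value
```

Paying each worker their marginal product would require six turnips’ worth of wages, while paying each their solo output would pay nothing at all, which is exactly why marginal product alone cannot settle how the one turnip gets divided.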


Posted in Inequality, Labor

The end of the pandemic is in sight

All pandemics come to an end. They feel endless at their peak, but then they end just as suddenly as they began. Today the Covid case count is finally clearly falling, so Omicron has probably peaked nationally in the USA, and it has already been plummeting in the cities it hit earliest, like New York and Chicago.

Omicron now accounts for nearly 100% of US Covid cases, and for boosted people it has been considerably less deadly than ordinary flu. But transmission rates are currently much, much higher, so overall risk is still higher than in a bad flu season. A month from now, though, transmission rates should be down to ordinary-flu-season levels, and we’ll basically be done with the pandemic unless a new variant evolves that evades prior immunity. That gets less likely with every new wave of contagion, so I’m optimistic that this is the last gasp of the pandemic.

In retrospect, we’ve done amazingly well. The economy and mortality took a much smaller hit than in the 1918 pandemic, which took three years to end and caused the biggest recession you’ve never heard of, because it was overshadowed in the history books by the end of WWI and the pandemic itself.* The only global recessions since then that were worse were the Great Depression (particularly in the US, which suffered worse than average) and WWII, which was a disaster globally but an economic boom in the US. Robert Barro and Jose Ursua’s research suggests that the flu pandemic caused a 12% drop in US GDP and a 6% to 8% fall in worldwide GDP between 1919 and 1921. The coronavirus pandemic caused the deepest US recession since the Great Depression, which was followed by the fastest recovery ever measured, so strong that we are now dealing with modest inflation and worker shortages! That is a much better problem to have than a long stagnation and high unemployment.

This time we developed numerous new treatments in record time using new research that has changed science forever. In particular, the mRNA vaccines were designed in days, entered testing within months, and were mass-manufactured in less than a year. Nobody would have predicted that in 2018 because it couldn’t have happened in 2018; the mRNA science wasn’t ready yet. So the timing was incredibly lucky. The fastest previous vaccine took four years just for development. And after that hurdle, vaccine manufacturing used to be very slow because it meant infecting living cells (sometimes chicken eggs) with a version of the virus so that the cells manufacture more virus, then extracting the virus and preserving just enough of it to cause an immune response but not the full illness. That manufacturing process is hard to scale up rapidly compared with mRNA vaccines, although the new adenovirus technique also made traditional vaccine manufacturing much faster. Because the US led in mRNA technology, it had the best vaccine production in the world.

The story of how the new mRNA vaccine science came together is really worth reading. It was a series of happy accidents, many coming out of failed efforts to develop an AIDS vaccine, that combined seemingly unconnected research and unexpectedly clicked into place last year. Less than two years after Covid-19 was first named, over 10 billion doses have been administered worldwide. That is more than enough to give everyone in the world a dose, but of course richer people have had three doses, which means that poorer people mostly have none. Still, vaccine production has ramped up considerably over the past year, and we could easily see 20 billion more doses over the next year if there is the money to do it.

76% of Americans have had at least one vaccination, and that is already close enough to herd immunity to dramatically blunt the effects of the Omicron wave. Omicron is delivering a ‘vaccination’ dose to most of the 24% of Americans who haven’t gotten a shot yet and giving a booster to lots of vaccinated Americans. By March, the majority of Americans will undoubtedly have had Covid at least once.**

The peak of US Covid hospitalizations should come in about another week, and the peak of mortality a week or two after that, so we aren’t done yet, but I think we can start planning our spring Covid parties to celebrate the end of the pandemic. Although the infection rate is terrible now, it will just get safer and safer. We just gotta get through a few more weeks of deepest darkness before the dawn. After that, the virus will still always be with us, but it won’t be any more worrisome than the many other endemic viruses we always have. They too caused pandemics in the past.

Of course another variant could cause another spike at some point, but if so, we will be even more prepared when it comes along. I’m betting that we are already in the last pandemic wave: the next variant won’t be nearly as problematic as any of the waves so far, and we’ll manage it more like we ordinarily manage a bad flu season.


Posted in Health

The age of evidence

I am teaching statistics this semester and one of the striking things about a lot of statistics is how new it is. The standard deviation was introduced in 1892. The t-distribution was published in 1908. The term “central tendency” dates from the late 1920s. The randomized controlled trial began development in the 1920s, and the first modern RCT wasn’t published until 1948. The box-plot wasn’t invented until 1970!

It is amazing how young the scientific method is and how much it continues to change. The natural sciences first adopted experimental methods in the 1600s, with Galileo Galilei and Sir Isaac Newton as archetypal examples. But if you didn’t know the future, you wouldn’t have predicted that evidence would become mainstream. Even these two archetypal examples didn’t portend well for science, since Newton was a fervent believer in the occult and alchemy, and Galileo was famously tried by the Inquisition and confined to house arrest for doing science. But the scientific method gradually took over the physical sciences like physics and chemistry and then expanded into the medical and social sciences too.

Evidence-based medicine first started to become mainstream in the early 1900s, although scientific improvement has been remarkably gradual. The term “evidence-based medicine” wasn’t coined until 1990 or 1992 (depending on the claim), and we still have a long way to go: according to one study, only about 18% of the decisions physicians make are evidence-based. But science has been accelerating:

  • Randomized controlled trials (RCTs) first became commonplace in the 1960s and have grown exponentially since.
  • Laboratory experiments in economics started becoming common in the 1950s, and field experiments (RCTs) took off in the 1990s, winning the Nobel Prize in 2019.
  • “Moneyball” sports analytics took off in the 1990s.
  • Evidence-based management became a trending topic in the 2000s.
  • The effective altruism movement to promote evidence-based philanthropy began to coalesce around 2011, when the term “effective altruism” first came into use.
  • Meta-analysis was rare before the 1990s and grew exponentially in the 2000s, as shown below.
  • The replication crisis of the 2010s applied meta-science to the study of scientific methodology, beginning in psychology, and it is revolutionizing psychological research. The other social sciences and medicine are grappling with similar problems of flawed methodology, which should lead to similar improvements in knowledge.

In my previous post, I showed some of the glorious story of RCTs, but meta-science shows that RCTs vary widely in quality. As a result, two different RCTs studying the same thing often reach different conclusions. To deal with contradictory RCTs, statisticians combine the results of multiple RCTs into meta-analyses, which also weight each RCT (or observational study) according to its quality:

The term meta-analysis was coined in 1976 by statistician Gene Glass …who described it as “an analysis of analyses.” Glass, who worked in education psychology, had undergone a psychoanalytic treatment and found it to work very well; he was annoyed by critics of psychoanalysis, including Hans Eysenck, a famous psychologist at King’s College London, who Glass said was cherry picking studies to show that psychoanalysis wasn’t effective, whereas behavior therapy was.

At the time, most literature reviews took a narrative approach; a prominent scientist would walk the reader through their selection of available studies and draw conclusions at the end. Glass introduced the concept of a systematic review, in which the literature is scoured using predefined search and selection criteria. Papers that don’t meet those criteria are tossed out; the remaining ones are screened and the key data are extracted. If the process yields enough reasonably similar quantitative data, the reviewer can do the actual meta-analysis, a combined analysis in which the studies’ effect sizes are weighed.

Today, meta-analyses are a growth industry. Their number has shot up from fewer than 1000 in the year 2000 to some 11,000 [in 2017]. The increase was most pronounced in China, which now accounts for about one-third of all meta-analyses. Metaresearcher John Ioannidis of Stanford University in Palo Alto, California, has suggested meta-analyses may be so popular because they can be done with little or no money, are publishable in high-impact journals, and are often cited.

Yet they are less authoritative than they seem, in part because of what methodologists call “many researcher degrees of freedom.” “Scientists have to make several decisions and judgment calls that influence the outcome of a meta-analysis,” says Jos Kleijnen, founder of the company Kleijnen Systematic Reviews in Escrick, U.K. They can include or exclude certain study types, limit the time period, include only English-language publications or peer-reviewed papers, and apply strict or loose study quality criteria, for instance. “All these steps have a certain degree of subjectivity,” Kleijnen says. “Anyone who wants to manipulate has endless possibilities.”

His company analyzed 7212 systematic reviews and concluded that when Cochrane reviews were set aside, only 27% of the meta-analyses had a “low risk of bias.” Among Cochrane reviews [which has the highest reputation], 87% were at low risk of bias…

Money is one potential source of bias… In 2006, for instance, the Nordic Cochrane Centre in Copenhagen compared Cochrane meta-analyses of drug efficacy, which are never funded by the industry, with those produced by other groups. It found that seven industry-funded reviews all had conclusions that recommended the drug without reservations; none of the Cochrane analyses of the same drugs did. Industry-funded systematic reviews also tended to be less transparent. Ioannidis found in a 2016 review that industry-sponsored meta-analyses of antidepressant efficacy almost never mentioned caveats about the drugs in their abstracts.

Unfortunately, the quality of meta-analyses also varies widely, so you cannot trust a meta-analysis any more than original research unless you know about the quality of its methodology. Fortunately, it is easy to find high-quality meta-analyses for medical research, because the Cochrane Review is the best place for meta-analyses about the state of the art. Sometimes they are a little out of date, but nobody does it better.
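The pooling step at the heart of a meta-analysis can be sketched as a fixed-effect, inverse-variance weighted average. The effect sizes and standard errors below are made-up numbers for illustration, not data from any real review:

```python
import math

# Hypothetical studies: (effect size, standard error).
# Bigger, more precise studies have smaller standard errors.
studies = [(0.30, 0.10), (0.10, 0.20), (0.25, 0.05)]

# Fixed-effect meta-analysis: weight each study by 1 / variance,
# so precise studies count for more in the pooled estimate.
weights = [1 / se**2 for _, se in studies]
pooled = sum(w * e for (e, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"pooled effect = {pooled:.3f} ± {pooled_se:.3f}")
```

The pooled estimate lands between the individual studies’ effects and is more precise than any single study. The judgment calls the quoted article describes (which studies to include, how to rate their quality, whether to use a random-effects model instead) all happen before this arithmetic, which is why two meta-analyses of the same literature can still disagree.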

Posted in Health, statistics

Why we have Extra Life today.

In the 1850s, upper royalty in England were living almost 70% longer than commoners. This graph shows average life expectancy for “ducal” families.  These were the highest nobility, the dukes, who ranked just below the King’s nuclear family in the royal hierarchy.  These were the top 1% of the 1% highest elites of England.

Surprisingly, the English elites had a lower life expectancy than commoners for hundreds of years despite having MUCH better food, clothing, and housing, and much less dangerous occupations. Their servants had to empty and clean their pots of “night soil” by hand every day, work in their mines, sail their ships, and fight on the front lines of their wars. While commoners risked life and limb, the nobility was the leisure class.

The richest nobility died earlier than commoners for centuries, until elite life expectancy began rising in the mid-1700s. The main theory for why points to the scientific revolution that began around then; before it, they simply didn’t know what was healthy. For example, whereas commoners expected to live near the edge of starvation at times during their lives, the elites could afford to drink beer and wine instead of water. That helped the elites avoid the bacteriological diseases that plagued the commoners, but even fairly modest regular consumption of alcohol has been shown to reduce life expectancy, and large quantities knock off years of life.

Probably a bigger factor was access to medical doctors. Only the elites could afford doctors, whose interventions were more deadly than doing nothing.

Consider the fates of two rivals, King George III and George Washington, the wealthiest and most powerful elites in England and America. In 1788, King George III had spasms of pain in his abdomen. His doctor had no idea what was causing them and gave him two powerful laxatives despite having no idea whether he was constipated. Then, fearing the laxatives were too strong and would cause too much diarrhea, the doctor gave him opium, an addictive sedative that causes constipation. But the spasms continued and his behavior became erratic.

In November of that year, he became seriously deranged, sometimes speaking for many hours without pause, causing him to foam at the mouth and his voice to become hoarse. George would frequently repeat himself and write sentences with over 400 words at a time, and his vocabulary became “more complex, creative and colourful.”

This is recounted in the movie The Madness of King George. An examination of George’s hair in the 2000s revealed why he may have gone mad: it demonstrated arsenic levels seventeen times higher than the standard for arsenic poisoning. His physicians’ records also confirm that the main treatment for his abdominal spasms was tartar emetic, which contained between 2 and 5 percent arsenic. An emetic is a medicine whose purpose is to cause vomiting, which is also likely to have exacerbated his abdominal spasms and pain. In addition, he was plastered with caustic poultices. A caustic is a chemical that is “able to burn or corrode organic tissue.” His doctors, who were considered the best in the country, put caustic poultices on him that ate away his skin to intentionally create wounds that would ooze, blister, and bleed out “evil humours.”

Surprisingly, he survived his treatment and made a full recovery (that time). George Washington was not so lucky. Here is an account from William Rosen‘s book about the history of antibiotics:

By the time the sun had risen, Washington’s overseer, George Rawlins… had opened a vein in Washington’s arm from which he drained approximately twelve ounces of his employer’s blood. Over the course of the next ten hours, two other doctors—Dr. James Craik and Dr. Elisha Dick— bled Washington four more times, extracting as much as one hundred additional ounces. Removing at least 60 per cent of their patient’s total blood supply was only one of the curative tactics used by Washington’s doctors. The former president’s neck was coated with a paste composed of wax and beef fat mixed with an irritant made from the secretions of dried beetles, one powerful enough to raise blisters, which were then opened and drained, apparently in the belief that it would remove the disease-causing poisons. He gargled a mixture of molasses, vinegar, and butter; his legs and feet were covered with a poultice made from wheat bran; he was given an enema; and, just to be on the safe side, his doctors gave Washington a dose of calomel— mercurous chloride—as a purgative. Unsurprisingly, none of these therapeutic efforts worked.

Doctors commonly gave people poisons like mercury and arsenic and covered their skin with caustic chemicals to produce blisters and inflated their intestines full of chemicals pumped in through their anuses (an enema) to wash out their insides before draining them of blood like vampires until they lost consciousness. Here is an expensive engraving from 1747 showing the most popular way to cure a drowning victim by inflating their intestines with tobacco smoke.  


Steven Johnson wrote Extra Life, a fantastic history of why most people live twice as long today as any known population before the 1700s:

As late as the onset of World War I, William Osler, the founder of Johns Hopkins, advocated for bloodletting as a primary intervention for military men who came down with influenza and other illnesses: “To bleed at the very onset in robust, healthy individuals in whom the disease sets in with great intensity and high fever is, I believe, a good practice.”… If you happened to be a pharmacist in 1900 looking to stock your shelves with medicinal cures for various ailments—gout, perhaps, or indigestion—you would likely consult the extensive catalog of Parke, Davis & Company, now Parke-Davis, one of the most successful and well-regarded drug companies in the United States. In the pages of that catalog, you would have seen products like Damiana et Phosphorus cum Nux, which combined a psychedelic rub and strychnine to create a product designed to “revive sexual existence.” Another elixir …contained [deadly nightshade], arsenic, and mercury. Cocaine was sold in an injectable form, as well as in powders and cigarettes. The catalog proudly announced that the drug would “[take] the place of food, make the coward brave, the silent eloquent and … render the sufferer insensitive to pain.” As the medical historian William Rosen writes, “Virtually every page in the catalog of Parke, Davis medications included a compound as hazardous as dynamite, though far less useful.” …The historian John Barry notes that “the 1889 edition of the Merck Manual of Medical Information recommended one hundred treatments for bronchitis, each one with its fervent believers, yet the current editor of the manual recognizes that ‘none of them worked.’ The manual also recommended, among other things, champagne, strychnine, and nitroglycerin for seasickness.”

…A twenty-seven-year-old Tennessean named Samuel Evans Massengill dropped out of medical school to start his own drug company with the aim of producing a sulfa [antibiotic] variant that would be easier to consume. In 1937, the chief chemist Harold Watkins at the newly formed S. E. Massengill Company hit upon the idea of dissolving the drug in diethylene glycol, with raspberry flavoring added to make the concoction … more palatable to children. The company rushed the concoction to market under the brand Elixir Sulfanilamide, shipping 240 gallons of the medicine to pharmacies around the United States, promising a child-friendly cure for strep throat. While sulfa did in fact have meaningful antibacterial effects, and the raspberry flavoring added the proverbial spoonful of sugar, diethylene glycol is toxic to humans. Within weeks, six deaths were reported in Tulsa, Oklahoma, linked to the “elixir,” each one from kidney failure. The deaths triggered a frantic nationwide search, with agents from the Food and Drug agency poring over pharmacy records, alerting doctors, and warning anyone who had purchased the drug to immediately destroy it. But the FDA didn’t have enough pharmacological expertise on staff to determine what made the drug so lethal. And so they outsourced that detective work to …Eugene Geiling. Within weeks, Geiling had his entire team of graduate students testing all the ingredients of the elixir on a small menagerie of animals in the lab: dogs, mice, and rabbits. Geiling quickly identified diethylene glycol—a close chemical relative of antifreeze—as the culprit… Many more [patients] had been hospitalized with severe kidney problems, narrowly avoiding death.

Massengill was ultimately fined $24,000 for selling the poison to unwitting consumers… [and] he declared, “I do not feel that there was any responsibility on our part.”

The pharmaceutical companies had no legal incentive to concoct elixirs that actually worked, given the limited oversight of the FDA [before 1938]. As long as their lists of ingredients were correct, they had free rein to sell whatever miracle potion they wanted. Even when one of those ingredients happened to be a known poison that killed 104 people, the penalty was only a financial slap on the wrist.

One might think that the market itself would provide adequate incentives for the pharma companies to produce effective medicines. Elixirs that actually cured the ailments they promised to cure would sell more than elixirs that were predicated on junk science. But the market mechanisms behind medical drugs were complicated by two factors that do not apply to most other consumer products. The first is the placebo effect. On average, human beings do tend to see improved health outcomes when they are told they are being given a useful medicine, even if the medicine they’re taking is a sugar pill. How placebos actually work is still not entirely understood, but the effect is real. There is no equivalent placebo effect for, say, televisions or shoes. If you go into business selling fake televisions, 20 percent of your customers are not going to somehow imagine fake television shows when they get their purchase back to their living rooms. But a pharma company selling fake elixirs will reliably get positive outcomes from a meaningful portion of its customers.

The other reason market incentives fail with medicine is that human beings have their own internal pharmacies in the form of their immune systems. Most of the time when people get sick, they get better on their own—thanks to the brilliant defense system of leukocytes, phagocytes, and lymphocytes that recognizes and fights off threats or injuries and repairs damage. As long as your magic elixir didn’t cause [massive amounts of] kidney failure, you could sell your concoction to consumers and most of the time they would indeed see results. Their strep throat would subside, or their fever would go down— not because they’d ingested some quack’s miracle formula but because their immune system was quietly, invisibly doing its job. From the patient’s point of view, however, the miracle formula deserved all the credit.

Figuring out which treatments cure people and which are shams is very hard. Bleeding sick people was still in American medical textbooks in the 1920s. Why have doctors mostly stopped bleeding their patients and giving them poisons like mercury, arsenic, deadly nightshade, and strychnine? Statistics.

It takes statistics to know which medicines are killing people and which are not, because not everyone will die from a modest dose of a poison. And even the best medicine cannot cure everyone; unfortunately, some people will die no matter what doctors try. So there is a large random element in disease that puts noise in the data. Figuring out cause and effect in the presence of that uncertainty requires statistics, and the more randomness and complexity there is, the harder it becomes.

It turns out that the best way to fight randomness is with more randomness: randomized controlled trials (RCTs) are the gold standard for researching cause and effect. This is the most important reason why medicine works much better today than it did in the early 1900s.

To do an RCT on a COVID treatment, you would take a large group of infected patients and randomly divide them into two groups: an “experimental group” that gets the treatment and a “control group” that gets a placebo, like a sugar pill or a saline injection that is known to have no effect on health. Then you see how much the two groups differ. That is where the statistics finally come in: if there are 100 patients in each group and 4 die in the experimental group while 5 die in the control group, is that a meaningful difference, or is it more likely due to random chance? Statistics is all about dealing with that kind of uncertainty. In this particular example, although 25% more people died without the treatment, the difference is not statistically significant, so it doesn’t teach us anything one way or the other, except that the treatment is not highly effective at curing 100% of COVID cases! After all, the difference is only one person, which is basically just an anecdote (a case study).
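You can check the 4-versus-5 example with a quick two-proportion z-test. This is a standard-library sketch of the idea; for counts this small, an exact test such as scipy’s `fisher_exact` would be the more careful choice:

```python
import math

def two_proportion_pvalue(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)              # pooled death rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    # Two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# 4 of 100 died with the treatment, 5 of 100 died with the placebo
p = two_proportion_pvalue(4, 100, 5, 100)
print(f"p-value ≈ {p:.2f}")  # far above 0.05: not statistically significant
```

A difference this small could easily arise by chance alone, which is exactly why one extra survivor tells us almost nothing about the treatment.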

The reason for randomization is to remove bias: it eliminates selection bias in who gets the treatment. For example, the sickest COVID patients get the most treatment and are the most likely to die, so if you just compare before and after treatment, a successful treatment might look like a failure. Or the people who volunteer for treatment might be healthier than those who do not. Only if you randomly divide a large population into two groups will the two groups be roughly similar, eliminating selection bias. But the population has to be large enough for the randomness to work. If you randomly divide a population of six people, you are unlikely to get the same number of men and women in the two groups, but if you divide up 600 people, there is unlikely to be much difference between the two groups.
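A small simulation shows why sample size matters for randomization. The populations here are hypothetical (exactly half women), and the only claim is about how balanced the random groups come out:

```python
import random

random.seed(0)

def average_imbalance(n, trials=2000):
    """Randomly split n people (half women) into two equal groups many
    times; return the average absolute gap in the groups' share of women."""
    people = [1] * (n // 2) + [0] * (n // 2)  # 1 = woman, 0 = man
    total = 0.0
    for _ in range(trials):
        random.shuffle(people)
        group_a = people[: n // 2]
        # Since women are half the total, group B's share mirrors group A's,
        # so the gap between the groups is 2 * |share_A - 0.5|.
        total += abs(sum(group_a) / len(group_a) - 0.5) * 2
    return total / trials

print(average_imbalance(6))    # big gaps between groups are common
print(average_imbalance(600))  # the two groups come out nearly identical
```

With six people, the two groups typically differ by tens of percentage points in their share of women; with six hundred, the gap shrinks to a few points, and the same balancing happens automatically for every trait, measured or not.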

There can also be a treatment bias if patients and doctors know who is getting a treatment and who is not. Doctors might unconsciously help one group more than the other, or there could be a placebo effect in which the patients actually feel better because they are psychologically reassured by the treatment they are getting. That can be avoided by doing a “double blind” study in which neither patients nor doctors know which group of patients is getting the real treatment and which is getting the control.

The earliest recorded precursor of an RCT was conducted by James Lind in 1747, but it had almost zero impact on how science was conducted; most people still preferred anecdotes (case studies) as evidence. Lind’s study successfully showed the British Navy that scurvy could be eliminated by eating citrus, and British sailors became known as “limeys” as a result! But it was a primitive study that was too small, lacked a placebo, and did not fully randomize its participants.

Psychologists and education researchers started occasionally doing RCTs in the 1880s, and agricultural researchers like R. A. Fisher began publishing them in the 1930s. The first true RCT in medicine was not published until 1948!

RCTs were very rarely done until 1966 when the FDA began testing drugs for efficacy. As you can see in the chart below, that was revolutionary. Before that, nobody knew how well most treatments worked or if they really even worked at all. In many ways, that is the beginning of when medicine finally became scientific, and the growth of medical science has been exponential since then.

To quote another excerpt of Steven Johnson’s excellent book:

Of all the late arrivals of intellectual and technological history… the RCT may well be the most puzzling, …There are few methodological revolutions in the history of science as significant as the invention of the RCT. (Only the seventeenth-century formulation of the scientific method itself—building hypotheses, testing them, refining them based on feedback—looms larger.) Like those empirical methods that Francis Bacon and other proto-Enlightenment scientists developed, the RCT is a surprisingly simple technique—so simple, in fact, that it begs the question of why it took so long for people to discover it… [The RCT is a] system for separating the quack cures from the real thing, one that avoids the many perils that had long bedeviled the science of medicine: anecdotal evidence, false positives, confirmation bias, and so on. When the FDA began demanding proof of efficacy from the drug manufacturers in 1962 [although the actual research didn’t come out until later as shown in the above graph], they could make that demand because a system was now in place—in the form of the RCT—that could meaningfully supply that kind of proof. The RCT emerged as a confluence of several distinct intellectual tributaries.

…The importance of randomization would not become apparent until the early twentieth [century], when the British statistician R. A. Fisher began exploring the concept in the context of agricultural studies as a way of testing the effectiveness of treatments on distinct plots of land. “Randomization properly carried out,” Fisher argued in his 1935 book, The Design of Experiments, “relieves the experimenter from the anxiety of considering and estimating the magnitude of the innumerable causes by which his data may be disturbed.” Fisher’s work on randomization and experiment design in the 1930s caught the eye of an epidemiologist and statistician named Austin Bradford Hill, who sensed in Fisher’s method a technique that could prove hugely beneficial for medical studies.

Hill would later echo Fisher’s description of the powers of randomization, writing that the technique “ensures that neither our personal idiosyncrasies (our likes or dislikes consciously or unwittingly applied) nor our lack of balanced judgement has entered into the construction of the different treatment groups—the allocation has been outside our control and the groups are therefore unbiased.” Hill recognized that the key to successful experiment design was … to remove his or her influence over the results of the experiment, the subtle contaminations that so often distorted the data.

As a young man, Hill had contracted tuberculosis while serving as a pilot in the Mediterranean, and so it was somewhat fitting that the first landmark study that Hill oversaw was investigating a new treatment for tuberculosis, the experimental antibiotic streptomycin. …the real significance of the study lay in its form. It is now widely considered to be the first genuine RCT ever conducted [in 1948]. Antibiotics… turned out to be the prime movers that finally transformed the world of medicine into a net positive force in terms of life expectancy. It is likely not a coincidence that the first true miracle drugs and the first true RCTs were developed within a few years of one another.

The two developments complemented each other: the discovery of antibiotics finally gave the researchers a drug worth testing, and the RCTs gave them a quick and reliable way to separate the promising antibiotics from the duds. Hill’s randomized, controlled investigation into the efficacy of streptomycin was a milestone in the history of experiment design. Its indirect effect on health outcomes …would have earned him a place in the pantheon of medical history had he never published another paper. But Austin Bradford Hill was just getting started. His next study would have a direct impact on millions [more] lives across the planet.

…another kind of killer was growing increasingly deadly across the population: lung cancer. The surge in deaths was truly alarming. By the end of the war, the Medical Research Council estimated that mortality from carcinoma of the lung had increased fifteen-fold from 1922. Cigarettes were one of the suspected causes, but many people pointed to other environmental causes: the exhaust from automobiles, the use of tar in roadways, other forms of industrial pollution.

…The Medical Research Council approached Hill and another noted epidemiologist named Richard Doll, asking the two men to investigate the lung cancer crisis. Today of course, even grade-schoolers are aware of the connection between smoking and lung cancer—even if some of them grow up to ignore it—but in the late 1940s, the link was not at all clear. “I myself did not expect to find smoking was a major problem,” Richard Doll would later recall. “If I’d had to bet money at that time, I would have put it on something to do with the roads and motorcars.” Hill and Doll devised a brilliant experiment to test the hypothesis that smoking might be connected to the surge in lung cancer cases.

The structure was a kind of inverted version of a traditional drug trial. The experimental group was not given an experimental medicine, and there was no placebo. Instead, the experimental group was made up of people with existing cases of lung cancer. Hill and Doll approached twenty different London hospitals to find [enough lung cancer patients to get] a statistically meaningful group… They then recruited … control groups at each hospital…. For each member of the “experimental” group—that is, the group with lung cancer—they tried to match with a control patient who was roughly the same age and economic class, and who lived in the same neighborhood or town. With those variables the same in each group, Hill and Doll ensured that some confounding factor wouldn’t contaminate the results. Imagine, for instance, that the lung cancer surge turned out to be caused by the industrial soot in Lancashire factories. An experiment that didn’t control for place of residence or economic status (factory worker versus salesclerk, say) wouldn’t be able to detect that causal link. But by assembling an experimental group and a control group that were broadly similar to each other [demographically], Hill and Doll could investigate whether there was a meaningful difference between the two groups in terms of smoking habits.

In the end, 709 people with lung cancer were interviewed about their smoking history, with the same number in the control group. Hill and Doll … explored those histories along different dimensions: average cigarettes smoked per day; total tobacco consumed over one’s lifetime; age when the subject began smoking. Once the numbers had been crunched, the results were overwhelming. “Whichever measure of smoking is taken,” Hill and Doll wrote, “the same result is obtained—namely, a significant and clear relationship between smoking and carcinoma of the lung.”

At the end of the paper … Hill and Doll made a rough attempt to evaluate the impact of heavy smoking on the probability of contracting lung cancer. By their estimate, a person who smoked more than a pack a day was fifty times more likely to develop lung cancer than a nonsmoker. The number was shocking at the time, but we now know it to have been a wild understatement of the risk. Heavy smokers are in fact closer to five hundred times more likely to develop lung cancer than nonsmokers.

Despite the overwhelming evidence the study conveyed, and the rigor of its experimental design, the 1950 paper they published …was initially dismissed by the medical establishment. Years later, Doll was asked why so many authorities ignored the obvious evidence that he and Hill had accumulated. “One of the problems we found in trying to convince the scientific community,” he explained, “was that thinking at that time was dominated by the discovery of bacteria such as diphtheria, typhoid, and the tubercle, which had been the basis for the big advances in medicine in the last decades of the nineteenth century.

When it came to drawing conclusions from an epidemiology study, scientists tended to use the rules that had been used to show that a particular germ was the cause of an infectious disease.” In a sense, the medical establishment had been blinded by its own success identifying the causes of other diseases. While an overwhelming number of lung cancer patients had turned out to be heavy smokers, there were still a number of nonsmokers who had suffered from the disease. Using the old paradigm, those nonsmokers were like finding a cholera patient who had never ingested the Vibrio cholerae bacterium. “But, of course, nobody was saying [smoking] was the cause; what we were saying is that it is a cause,” Doll explained. “People didn’t realize that these chronic diseases could have multiple causes.”

Undeterred, Hill and Doll set out to conduct another experiment…. They decided to see if they could predict cases of lung cancer by analyzing people’s cigarette use and health outcomes over many years. This time they used physicians themselves as the subjects, sending out questionnaires to more than fifty thousand doctors in the United Kingdom, interviewing them about their own smoking habits and then tracking their health over time. “We planned to do the study for five years,” Doll later recalled. “But within two and a half years, we already had 37 deaths from lung cancer [among smokers] and none in nonsmokers.” They published their results early, in 1954, in what is now considered a watershed moment in the scientific establishment’s understanding of the causal link between smoking and cancer.

In that 1954 paper, the experiment design proved to be less important than the unusual choice of subjects. Hill and Doll had originally decided to interview physicians because it was easier to follow up with them to track their health and cigarette use over the ensuing years. But the decision proved to have additional benefits. “It turned out to have been very fortunate to have chosen doctors, from a number of points of view,” Doll noted. “One was that the medical profession in this country became convinced of the findings quicker than anywhere else. They said, ‘Goodness! Smoking kills doctors, it must be very serious.'” Exactly ten years after the publication of Hill and Doll’s second investigation…, the United States [Surgeon General,] Luther Terry, a physician, famously issued his Report on the Health Consequences of Smoking, which officially declared that cigarettes posed a significant health threat. (After nervously puffing on a cigarette on the way to the announcement, Terry was asked during the press conference whether he himself was a smoker. “No,” he replied. When asked how long it had been since he had quit, he replied, “Twenty minutes.”) …When Hill and Doll interviewed their first patients in the London hospitals, more than 50 percent of the UK population were active smokers. Today the number is just 16 percent. Quitting smoking before the age of thirty-five is now estimated to extend your life expectancy by as much as nine years.

Steven Johnson’s book is fantastic and you should get it and read the rest too! 

Rich elites used to live shorter lives than the average person (who was extremely impoverished by comparison) because elites didn’t know how to use their money to buy longer lives. Even in the mid-1960s, medical doctors and other rich Americans were more likely to smoke tobacco than poorer Americans. That abruptly changed after 1964, when the US Surgeon General announced that smoking causes cancer and other illnesses. Rich people, and especially doctors, gave up smoking, which added many years to their life expectancy. Less-educated Americans haven’t given it up, which has contributed to increasing inequality of life expectancy between the rich and the poor in America. Whereas impoverished people used to live longer than rich elites, science has dramatically boosted the life expectancy of elites, and the benefits have not trickled down as much to poor Americans. Today people in America’s richest neighborhoods live 30 years longer than the average in our poorest neighborhoods.

Without statistical methods in general and RCTs in particular, medical care would have produced very little benefit in the last half century. In earlier times, scientific advances had such big effects that they were obvious without statistics. For example, sanitation, pasteurization, and water purification had such big, immediate effects that RCTs weren’t needed. But without RCTs, we would have no COVID vaccines today and no idea what works for treating COVID, because it isn’t obvious without statistics. Even with massive statistical evidence, it is still very hard to convince many Americans that smoking causes cancer and that vaccines save lives.
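To see why randomization is so powerful, here is a toy simulation (every number in it is made up, not from any real trial). Each hypothetical patient has a hidden baseline chance of recovery; random assignment balances those hidden factors across groups, so a simple difference in recovery rates recovers the assumed treatment effect:

```python
import random
import statistics

random.seed(42)

# Hypothetical population: each person has a different baseline chance
# of recovery, driven by hidden factors (age, overall health, etc.).
baselines = [random.gauss(0.4, 0.1) for _ in range(10_000)]

# Randomization: shuffle, then split. No human judgment is involved,
# so the hidden confounders end up balanced across both groups.
random.shuffle(baselines)
treatment, control = baselines[:5_000], baselines[5_000:]

TRUE_EFFECT = 0.15  # assumed boost in recovery probability from the drug

def recovered(baseline, treated):
    return random.random() < baseline + (TRUE_EFFECT if treated else 0.0)

treat_rate = statistics.mean(recovered(b, True) for b in treatment)
control_rate = statistics.mean(recovered(b, False) for b in control)
print(f"estimated effect: {treat_rate - control_rate:.3f}")
```

The estimated effect comes out close to the true 0.15, even though no individual patient's hidden baseline was ever measured. That is Fisher's point: randomization "relieves the experimenter from the anxiety" of enumerating confounders.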

Posted in Development, Health, statistics

Median inflation (or other trimmed means) works best because of high variation in price changes

The standard inflation indexes track the average change in prices. One problem with any average is that it is disproportionately influenced by outliers, and some prices are highly volatile due to market forces that have nothing to do with monetary policy. Central banks need to measure how much the things they control are affecting inflation so they can adjust the levers they directly control: the money supply and short-term interest rates.

Kevin Drum shows that the median price change is one of the best measures for adjusting monetary policy:


This chart shows three measures of inflation that are designed to avoid the variability of headline CPI and give a better look at the real level of inflationary pressures:

Core CPI omits food and fuel because they jump around a lot and don’t really tell us anything about underlying inflation.

Trimmed mean CPI cuts off the biggest gainers and losers so that a few outliers don’t affect the bulk of the items in the CPI basket.

Median CPI looks at the inflation rate of the median item in the CPI basket.

For those of you who want evidence that inflation isn’t all that bad right now, median CPI is your hot ticket. Not only is it relatively low at 3.5%, but it’s only 36% above its average 2016-2019 level. The others are 116% and 135% above their 2016-2019 averages.
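The three measures in Drum's list are easy to compute. Here is a sketch with a hypothetical basket; the items and numbers are invented, and a real trimmed-mean CPI trims by expenditure weight rather than by item count:

```python
import statistics

# Hypothetical year-over-year price changes (%); not real CPI data.
price_changes = {
    "gasoline": 45.0, "used cars": 30.0, "new cars": 10.0,
    "food at home": 6.0, "rent": 4.0, "recreation": 3.0,
    "medical care": 2.5, "education": 2.0, "apparel": 1.0,
    "airfare": -5.0,
}

changes = sorted(price_changes.values())

headline = statistics.mean(changes)            # plain average, pulled up by outliers
trimmed_mean = statistics.mean(changes[1:-1])  # drop the biggest gainer and loser
median = statistics.median(changes)            # middle of the distribution

print(f"headline:     {headline:.1f}%")
print(f"trimmed mean: {trimmed_mean:.1f}%")
print(f"median:       {median:.1f}%")
```

With these made-up numbers, the headline average is about 9.9% while the median is only 3.5%: the two volatile outliers do almost all the work in the plain average.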

Here is the conclusion of a recent paper that studied the accuracy of various measures of core inflation:

The last two years have been highly informative about the behavior of alternative core measures. Headline inflation has fluctuated erratically….XFE [eXcluding Food and Energy] inflation has performed quite poorly….Fixed-exclusion measures of core that exclude a wider set of industries, such as the Atlanta Fed’s sticky-price inflation rate, have performed better, but the most successful measures have been weighted medians and trimmed means.

When the inflation rate rises from 2% to 7%, a lot of Americans get very upset, and it seems to be as upsetting as an equivalent rise in unemployment, even though economic statistics show that the real harm to society of 7% unemployment is far higher than that of 7% inflation. I’ve always been puzzled why people hate such a modest increase in inflation, but one reason is probably that most Americans pay much more attention to some prices than to others. In particular, they pay attention to the prices of necessities like energy and food, which are also the most volatile prices. So the misery of high inflation isn’t really about the average inflation rate, but about inflation in the prices that have a bigger impact on lifestyle, like food, gas, and used cars, which are all soaring currently. So perhaps we should create a psychological price index that is weighted according to how strongly people feel about the price of each good. That wouldn’t be as good for directing monetary policy, but it would help explain when inflation is most upsetting to people.
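One way to sketch such a psychological index: reweight each item's price change by how much attention people pay to it rather than by its expenditure share. All of the items, price changes, and weights below are invented for illustration; a real version would need survey data on salience:

```python
# Hypothetical data: (price change %, CPI expenditure weight, attention weight).
items = {
    "gasoline":        (45.0, 0.04, 0.25),
    "groceries":       ( 6.0, 0.08, 0.30),
    "used cars":       (30.0, 0.03, 0.15),
    "rent":            ( 4.0, 0.33, 0.20),
    "everything else": ( 2.0, 0.52, 0.10),
}

cpi = sum(chg * w for chg, w, _ in items.values())    # expenditure-weighted
felt = sum(chg * a for chg, _, a in items.values())   # attention-weighted

print(f"expenditure-weighted inflation: {cpi:.1f}%")
print(f"attention-weighted inflation:   {felt:.1f}%")
```

With these invented weights, measured inflation is about 5.5% while "felt" inflation is over 18%, because salient, volatile necessities dominate the psychological index.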

Of course, there are always some prices whose changes upset someone. When real wages rise, that makes most people happy because most of us get most of our income from work, but it upsets people who get most of their money from owning things (the capitalists). According to Jonathan Nitzan and Shimshon Bichler, inflation typically does the reverse: it makes the capitalists richer and workers poorer. That is probably a big real reason why ordinary Americans hate inflation.

Blair Fix is an interesting thinker who argues that the amount of variation in prices is so huge that estimating inflation is useless. I don’t go that far, but his analysis is fascinating and ground-breakingly creative.

I can calculate the average of any conceivable set of numbers. But that doesn’t mean my calculation will be informative. That’s because averages define a central tendency, yet do not indicate if this tendency actually exists.

Here’s an example. Suppose two people have an average net worth of $100 billion. Is this a central tendency? Perhaps … if our two people are Warren Buffett (net worth $104 billion) and Mukesh Ambani (net worth $96 billion). Both have close to the same wealth. But what if the two people are Jeff Bezos (net worth $200 billion) and me (net worth $0 billion)? In this case, the average misleads more than it informs.

Of course, scientists are aware of this problem. That’s why they are trained to report averages together with a measure of variation. Doing so gives a sense for the meaningfulness of the average.

Any measure of variation will do, but the most popular is the ‘standard deviation’, which measures the average deviation away from the mean. Returning to my wealth example, reporting the standard deviation of wealth tells us when the average measures a real central tendency, and when it does not.

For instance, Warren Buffett and Mukesh Ambani have an average net worth of $100 billion, with a net-worth standard deviation of $5.7 billion. The fact that the variation is small (about 0.06 times the average) indicates that there is a real central tendency. In contrast, Jeff Bezos and I have an average net worth of $100 billion, with a standard deviation of $141 billion. This enormous variation (1.4 times the average) indicates that there is no central tendency in the raw data. So the average is uninformative…

Figure 3 shows the price change of every commodity tracked by the consumer price index. Instead of clustering tightly around the average price level (the ‘official CPI’), real-world commodities have a mind of their own. Their prices head in all sorts of directions — often in ways that seem unrelated to the movement of the average price.

Figure 3: Price change in the real world. The black line shows the change in the US consumer price index since January 1, 2020. The colored lines show the indexed price of all the individual commodities tracked by the CPI. Many commodities are tracked in multiple locations. [Sources and methods]

Notice how plotting the price-change of all CPI commodities alters the inflation story. Looking at Figure 3, no one would conclude that all prices are inflating uniformly. Yet when we looked at the consumer price index alone, this conclusion seemed plausible.

When we study the whole range of price change, we see that inflation is a messy business. The numbers tell us as much. Since January 1, 2020, the consumer price index rose by 7.3%. That value seems significant … until we measure price-change variation. Over the same period, the standard deviation of price change was 10.7%. So the variation in price change was about 1.5 times larger than the price-change average.

To put this variation in perspective, let’s return to our wealth example. Jeff Bezos and I have an average wealth of $100 billion. But this value does not indicate a real central tendency. Jeff Bezos is worth $200 billion. I’m worth $0 billion. We can tell that the average is misleading by measuring the standard deviation of our wealth, which happens to be $141 billion. So the variation in our wealth is about 1.4 times the average.

This ratio of 1.4, you’ll notice, is actually less than the ratio of 1.5 we found for price-change variation. So if we conclude that it’s rather meaningless to average my wealth with Jeff Bezos’s wealth, we should also conclude that the movement of the consumer price index is quite meaningless. Both averages mislead more than they inform…

The real story of inflation — the one that goes largely unreported — is of wildly divergent price change among different groups of commodities. Figure 4 shows how this inflation has played out across 12 major commodity groups tracked by the US consumer price index.

Figure 4: US price change by commodity group. Box plots show the range of price change between Jan. 2020 and Oct. 2021, for US CPI commodities classified into 12 major groups. Here’s how to read the box plot. The thick vertical line indicates the median value. The ‘box’ shows the middle 50% of the data. And the line shows the range of the data, excluding outliers. [Sources and methods]

…We can see that inflation varies greatly between different commodity groups. Some groups, like ‘men’s apparel’, have experienced little (if any) inflation. Other groups, like ‘private transportation’, have seen massive price hikes. Figure 4 also shows that inflation varies greatly within each commodity group. Inflation often coexists with deflation — a fact that’s evident when the boxes cross the dashed red line.

So the real inflation story, which goes largely undiscussed, is that price change is remarkably non-uniform.

…let’s look at the long-term history of US price-change variation. Figure 6 shows the data. I start by replotting the average inflation rate (blue line). But then I add some much-needed information — the range of price change across all CPI commodities. That’s the blue region, which plots the 95% range for the annual price change of all commodities (in all locations) tracked by the CPI. The price-change range is … rather large.

Figure 6: The history of US price-change variation. The blue line shows the annual change in the consumer price index, replotted from Figure 5. The shaded region shows the 95% range (the range for the middle 95% of the data) in the annual price change of all commodities (in all locations) tracked by the CPI. [Sources and methods]

The evidence in Figure 6 suggests that our current situation is not unusual. Since the CPI data began in 1913, the US inflation rate averaged 2.8%. But over the same period, the standard deviation of annual price change averaged 5.2%. So the inflation variation was historically about 1.8 times larger than the inflation average. To remind you, the variation between Jeff Bezos’s wealth and my wealth was only 1.4 times our average wealth. So looking at the average US inflation rate is even less meaningful than averaging Bezos’s wealth with my own.

To summarize, the data is pretty clear: the historical norm has been for price-change variation to trump the average rate of inflation. So why does this variation go unreported?

He has a good point. Other economic indexes like the S&P 500 don’t report variance either, and they should. But arguing that a large coefficient of variation makes an average meaningless isn’t right either. Suppose inflation were always zero: the coefficient of variation would be infinite even if the standard deviation of price changes were only infinitesimally small. Whenever data can be both positive and negative, as inflation data can, the coefficient of variation is always going to be huge. Blair Fix is right that variation is important, and the high variation of price changes does reduce the usefulness of inflation as a statistic, but it is still useful and important.
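Both halves of the argument are easy to check with Python's statistics module. The wealth numbers come from Fix's example; the tiny price changes at the end are invented to show where the coefficient of variation breaks down:

```python
import statistics

# Fix's wealth example: Jeff Bezos ($200B) vs. $0B.
wealth = [200, 0]
mean_w = statistics.mean(wealth)   # 100
sd_w = statistics.stdev(wealth)    # ~141, matching the $141B in the post
print(f"wealth CV: {sd_w / mean_w:.2f}")  # ~1.41, Fix's ratio of 1.4

# The breakdown case: hypothetical price changes scattered around zero.
changes = [0.01, -0.01, 0.02, -0.02]
mean_c = statistics.mean(changes)  # exactly 0
sd_c = statistics.stdev(changes)   # small but positive
# sd_c / mean_c would divide by zero: an "infinite" coefficient of
# variation even though every price barely moved at all.
print(f"mean change: {mean_c}, std dev: {sd_c:.4f}")
```

So the coefficient of variation is a sensible yardstick for all-positive data like wealth, but for data centered near zero, like inflation in a stable-price era, it blows up no matter how calm prices actually are.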

Posted in Macro

Why Netflix doesn’t care if you “illegally” share your account with friends

In Michael Heller and James Salzman’s book, Mine!: How the Hidden Rules of Ownership Control Our Lives, they say that streaming video companies like HBO Max do not care that people get one account and share it with friends. This technically violates their terms of service, but in practice, they don’t care because they are hoping for benefits:

  1. Freeloaders will develop a habit/addiction for the product, and some freeloaders will grow up and get subscriptions of their own.
  2. The company brand is developing good will that will generate future sales.
  3. Shows are developing more ‘buzz’ which generates more paying customers now.
  4. Lower revenues (and fewer eyeballs) for rival companies like Netflix right now.

Another way to look at their acceptance of freeloaders is through the lens of price discrimination: deliberately charging different markups to different customers. Customers with lower elasticity of demand get charged a higher markup than more flexible customers who wouldn’t pay higher markups anyhow. That allows a company both to soak customers with a higher willingness and/or ability to pay and to get a little bit of benefit from customers who wouldn’t pay a high price.
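The textbook version of this logic is the Lerner markup rule, (P − MC)/P = 1/|elasticity|: the less elastic a segment's demand, the higher its profit-maximizing markup. Here is a sketch with invented segments and numbers:

```python
def optimal_price(marginal_cost, elasticity):
    """Profit-maximizing price under constant-elasticity demand (|e| > 1)."""
    return marginal_cost * elasticity / (elasticity - 1)

mc = 2.00  # hypothetical marginal cost of serving one more subscriber

for segment, e in [("die-hard fans", 1.5),
                   ("casual viewers", 3.0),
                   ("freeloaders", 20.0)]:
    print(f"{segment}: charge ${optimal_price(mc, e):.2f}")
```

As elasticity grows, the optimal price falls toward marginal cost, and since a streaming service's marginal cost per extra viewer is roughly zero, the "optimal" price for the most price-sensitive segment is roughly zero too. That is the freeloader tier.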

Price discriminating companies always charge at least their marginal cost, but for products that have zero marginal cost, it is rational to offer a version for free if that version is complementary with other sales. There is a danger that a free version could cannibalize sales of the profitable version, but free samples are also a potent marketing expenditure that can generate future sales, and giving them away is complementary with the industry’s primary strategy right now. The streaming video industry is growing rapidly and there is a danger that only a few of the biggest companies will survive, so they are in a grow-market-share phase. They are all competing for market share, so anything that generates a bigger market share than their rivals is good. If some customers are freeloading on HBO, that could mean fewer customers for Netflix, and Netflix is probably a bigger threat to HBO than freeloaders are.

Someday the industry will transition to a maximize-current-profits strategy and their price discrimination scheme will change and they will focus more on getting rid of “illegal” account sharing, but right now, it is a win-win to just let it slide.

Posted in Managerial Micro

Looking to buy cheap land? Here is the cheapest in the USA.

Check out this map of land values across the USA.

The wealthy landowners with the most acreage tend to buy in areas with cheap land. Almost half of the cheap land in northern Maine is owned by just a few families!

Some of the oddities on the private land map are due to the history of how private railroad construction was subsidized by the US government. The government gave a corridor of land to railroad companies that they could sell to make money to pay for construction along certain routes:

Today, you can still see strips of private ownership along these routes such as the strip through Nevada where the Central Pacific railroad runs and through Arizona that is a legacy of the Atlantic & Pacific route.

The main thing that makes land valuable is having more neighbors because higher population density means more customers, more jobs, and higher productivity per acre as shown in this map:

If we switched our tax system more towards a land-value tax, it would increase population density and also make land cheaper.

Looking for more data about real estate prices? Numbeo has some interesting numbers although they seem to be crowdsourced which can be inaccurate. According to them, the USA has some of the most affordable cities in the world when adjusted for earning power. Of course, for travelers, other things are more important than real estate for determining cost of living.

Posted in Public Finance, Real Estate
