Thoughts on Probability and Statistics | The 2020 Novel Coronavirus Outbreak

The uncertainty of an estimate doesn’t always go both ways. For 2019-nCoV, one-sided uncertainty certainly indicates underestimation.

The Wuhan coronavirus (2019-nCoV) is currently spreading throughout China and the world. As of Jan 25, 12am EST, there are 1,354 confirmed cases and 41 confirmed deaths worldwide. While there are questions about whether these statistics are accurate, few people appreciate that they are unlike the statistics we usually encounter, because of the way the data is collected.

Let me demonstrate the difference with two examples.

(1) Statistics with Two-Sided, Symmetric Uncertainty

Consider an election, which is not like a disease outbreak. You might want to know what proportion of voters support a certain presidential…


Thoughts on Probability and Statistics

The death toll from epidemics since 1900 has been dominated by a very few of the most severe epidemics. This teaches us about the true definition of extreme values in statistics.

What is a black swan event, and where do they come from?

The idea of a black swan dates as far back as the time of the Roman poet Juvenal.

rara avis in terris nigroque simillima cygno — From The Satires, Line 6.165

Translation: A rare bird in these lands, very much like a black swan.

Historically, it was presumed that black swans did not exist. Thus, when Dutch explorers became the first Europeans to see a black swan, it was a big shock — something thought to be impossible suddenly became possible.

The idea was later generalized in a book series and became more commonly known. However, it is still not…


Common measurements of volatility and correlation can be highly volatile and misleading, drastically underestimating the true risk of an investment.

Investing is a risky endeavour. Take a wrong step, and you could find your cash burned up. So how do you pick the right investments? Should you trust news? Should you trust the brokers? Should you do your own research? Or should you pay someone else to manage your investments?

Among the more sophisticated investors are those who rely on techniques from mathematics and statistics. These people compare different investment options in terms of return and risk, in order to construct the optimal portfolio.

Reducing the volatility of an investment portfolio

It’s obvious why people would want to maximise their returns. But why do people try to…


Trying to compute something? It might be too slow. Drop your math textbooks. Optimize your code with complexity theory instead.

Calculating a running average

In my last post on insurance, I wanted to calculate a running mean (the mean from the beginning up to the current time t). This was for a time series with 1,000,000 time-steps. In other words, I had to calculate 1,000,000 means. How did I do this?

Algorithm 1: straight out of a math (and not a CS) textbook

If you look inside any standard math textbook, you would find something like this.

Definition of the mean gives us the mean from time 1 up to time t.
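Written out in standard notation (the symbols here are an assumption, since the post's own formula is not reproduced in this excerpt), the mean from time 1 up to time t is:

\[
\bar{X}_t = \frac{1}{t} \sum_{i=1}^{t} X_i
\]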

Using this formula directly, you might then have an algorithm like this:

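# X is a time series (a list of numbers) with length T.
T = len(X)
runningMean = [X[0]]  # the mean of the first value alone
for t in range(1, T):
    # X[0:t+1] holds the first t+1 values, so divide by t+1.
    runningMean.append(sum(X[0:t+1]) / (t + 1))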

But when you actually run it, the speed of the execution…
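Algorithm 1 recomputes the whole sum at every step, so the total work grows roughly with the square of T. One common way around this is to carry a running total forward, which needs only one addition and one division per step. The sketch below is a minimal illustration of that idea and is not necessarily the algorithm this post goes on to describe; the function name `running_mean` is hypothetical.

```python
def running_mean(X):
    """Return the mean of the first t values of X, for every t from 1 to len(X)."""
    means = []
    total = 0.0
    for t, x in enumerate(X, start=1):
        total += x               # running sum of the first t values
        means.append(total / t)  # mean of the first t values
    return means


print(running_mean([2, 4, 6, 8]))  # [2.0, 3.0, 4.0, 5.0]
```

Each value is touched once, so a series with 1,000,000 time-steps needs about 1,000,000 additions rather than hundreds of billions.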


Mathematics, risk and society. How a theorem in probability shows that pro-social behaviour (insurance) mitigates individual risk and supports long-term survival.

Contents

Introduction — A simple model of savings
Part I — The benefits of an insurance scheme
Part II — The law of large numbers
Part III — When insurance schemes fail
Appendix — Math for the simulation

Inspired by Ole Peters (lecture) and Nassim Nicholas Taleb.

This article strings together a surprising series of thoughts which I’ve had over the past year. Expected values, ensemble averages, time averages, empirical vs theoretical mean. Insurance, cooperation, sharing, culture, tradition, conservatism and politics. Risk, correlation, contagion and catastrophe. Portfolio diversification and market delusion. Survival, elimination, and evolution. …


The experts have great tools for science, but not for real life. How do we keep a distance of 1.5 metres? It’s not easy — but ancient wisdom can tell us what to do.

Social Distancing: “Is this far enough, officer?”

Which genius decided on the rules for social distancing?

This is the official advice from the Australian government on public gatherings during this COVID-19 coronavirus outbreak.

Stay 1.5 metres away from others

I don’t know how long 1.5 metres is, and I don’t have a ruler on me. Neither do my neighbours, and who knows if the other people around me know the number of centimetres in a metre.

Imagine standing in a queue, and you are standing too close to the person in front. The police catch you, and now you find yourself being questioned and lectured about the importance of the public health measures.

Even worse: no more than 1 person per 4 square metres.


The 2020 Novel Coronavirus Outbreak | Thoughts on Probability and Statistics

How false negatives in diagnostic testing are leading to the release of infected people, motivating extreme containment measures. The COVID-19 outbreak, explained with Bayes’ Rule.

If you are reading this after 2020, please keep in mind that this post was written during the early stages of the COVID-19 pandemic, and hence, may not reflect a reality beyond this time.

We are currently in February 2020. Over the past month, a deadly virus has been spreading throughout China and the world, sending the infected to the ICU and trapping others in their homes. As authorities try to manage this crisis, they face the challenging issue of containment — sending the infected to quarantine, while allowing the non-infected to go free.
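To make the containment problem concrete, Bayes' Rule gives the probability that a person is infected even though they tested negative. The sketch below is only an illustration: the function name and the prevalence, sensitivity, and specificity values are hypothetical placeholders, not figures from this post or from any study.

```python
def p_infected_given_negative(prevalence, sensitivity, specificity):
    """P(infected | negative test), by Bayes' Rule.

    prevalence:  P(infected) among the people being tested
    sensitivity: P(test positive | infected)
    specificity: P(test negative | not infected)
    """
    false_negative = prevalence * (1 - sensitivity)  # infected, tests negative
    true_negative = (1 - prevalence) * specificity   # healthy, tests negative
    return false_negative / (false_negative + true_negative)


# Hypothetical inputs, chosen only to show the calculation.
risk = p_infected_given_negative(prevalence=0.3, sensitivity=0.7, specificity=0.99)
print(f"P(infected | negative) = {risk:.3f}")
```

The lower the sensitivity, the larger the share of negative results that belong to infected people, which is the mechanism the subtitle describes.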

The Problem With Epidemics That Plagues The Authorities

Here is the scenario. You have…


Thoughts on Probability and Statistics

What is the “average” and how do we find it? Forget the formula — and get better at math.

How to Understand the Idea of “Average”

Formulas Are For Calculation, Not For Understanding

When people teach you about statistical concepts, you usually get an equation that gives a formula for some quantity, like the arithmetic mean. Formulas are fine, but they are designed with calculation in mind. Usually, the equation will put the unknown on one side, and all known quantities on the other.

Like this.

Arithmetic and geometric mean formula, optimised for calculation
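In standard notation, which may differ from the symbols in the original figure, the two formulas for n values x_1, ..., x_n (all positive, in the geometric case) are:

\[
\bar{x}_{\text{arith}} = \frac{1}{n} \sum_{i=1}^{n} x_i,
\qquad
\bar{x}_{\text{geo}} = \Bigl( \prod_{i=1}^{n} x_i \Bigr)^{1/n}
\]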

Unfortunately, this view does not help students develop a good understanding of concepts like the “average”. And as a result, it is not difficult to find people misapplying statistics, for example, using the arithmetic mean on financial returns data when the geometric mean makes more sense.
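A small worked example shows how the two means can disagree on returns data. The returns below are hypothetical and chosen only for illustration: a gain of 50% followed by a loss of 50%.

```python
import math

# Hypothetical per-period returns (illustrative, not data from the post).
returns = [0.5, -0.5]
growth = [1 + r for r in returns]  # growth factors: 1.5 and 0.5

arithmetic = sum(returns) / len(returns)
geometric = math.prod(growth) ** (1 / len(growth)) - 1
final_wealth = math.prod(growth)   # what 1 unit invested ends up worth

print(f"arithmetic mean return: {arithmetic:+.2%}")
print(f"geometric mean return:  {geometric:+.2%}")
print(f"final wealth per unit invested: {final_wealth:.2f}")
```

The arithmetic mean reports a return of zero, yet the investment has lost a quarter of its value. The geometric mean is negative, because it is the constant per-period return that would produce the same final wealth.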

By…


The 2020 Novel Coronavirus Outbreak

Why the reported “mortality rate” of the Wuhan coronavirus (2019-nCoV) is misleading and distracts us from the severity of the outbreak in China and Asia.

Update (Feb 10): Some researchers have estimated the case-fatality rate. In my new post, I summarize their research and explain why it’s worse than it looks.

At the start of 2020, murmurs of a mysterious SARS-like coronavirus started to spread in the city of Wuhan, in Hubei, China. Soon enough, the world knew what was going on. This new type of virus, which was given various names such as Wuhan coronavirus, novel coronavirus and 2019-nCoV, brought back memories of the 2003 SARS outbreak which infected around 8,000 people and killed nearly 800 worldwide.

These viruses were similar in the way it…


Thoughts on Probability and Statistics

Suppose that you are collecting a sample of ratings data, which can range from 1 to 10. Let’s say that your sample consists of 100 points, and consider two possible samples.

  1. 10 people give a rating of each possible value from 1 to 10.
  2. 50 people give a rating of 1, and 50 people give a rating of 10.

The first sample corresponds to a uniform distribution over all 10 possible values.

Uniform Distribution

The second sample corresponds to a bimodal distribution at the extreme values 1 and 10.
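The two samples can be built and summarised in a few lines. This excerpt stops before saying which statistics the post compares, so the choice of the mean and the population standard deviation here is an assumption made for illustration.

```python
import statistics

# Sample 1: ten people give each rating from 1 to 10.
uniform_sample = [rating for rating in range(1, 11) for _ in range(10)]
# Sample 2: fifty people give a rating of 1, fifty give a rating of 10.
bimodal_sample = [1] * 50 + [10] * 50

for name, sample in [("uniform", uniform_sample), ("bimodal", bimodal_sample)]:
    mean = statistics.mean(sample)
    spread = statistics.pstdev(sample)  # population standard deviation
    print(f"{name}: n={len(sample)}, mean={mean}, std={spread:.3f}")
```

Both samples have the same mean of 5.5, but the bimodal sample is far more spread out, and in that sample nobody actually gave a rating near the mean.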

Andy Chen

Math, stats, data. Influenced by the complex systems perspective. I prefer to take the critical view.
