Looking at the figures published by Johns Hopkins CSSE at https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
As I write, 204 out of 213 deaths took place in Hubei province, with 5806 confirmed cases out of 9925.
So, in the province where the epidemic started, the mortality is over 3%.
The second province by number of confirmed cases is Zhejiang, with 538 cases but not a single death yet. How do you folks interpret this?
One of the difficulties with a fast moving virus like this is that the numbers can be very confusing until you remember that it is behaving mostly as an exponential growth curve and that there are time delays involved. Also, the parameters involved have ranges, and specific conditions in different areas can either mask the numbers, or make them look worse, or distort them in other ways.
Over time as various control strategies are employed, as people do various things, the basic parameters will or at least can change values. If the virus mutates, that can also have a large impact - though, so far that is not the case here.
More than this, non-obvious things like differences in age distributions, gender ratios, health status, other life style and disease factors (smoking, diabetes, ...) come into play and can have large importance.
Do not take any of what follows as correct, true, or valid for predicting anything. These are crude numbers and a very simple model to get a ballpark idea of what is happening, and to help answer Bernard's question (trying to make sense of the numbers).
Taking all of that, plus inherent randomness and other factors into account - the disease outbreak looks to be something like this:
1) The virus is spreading person to person.
2) The "average" contact transmission (R0) seems to be between 2 and 6 people infected by each person with the disease. Estimates generally fall between 2.2 and 5.6, with an average from all of the available data of somewhere near 2.7.
3) The number of people infected, as reflected by "confirmed" infections, was initially growing by a factor of about 1.32 per day. More recently it has averaged about 1.62 per day. Through the whole period, factors in the range of 1.32 to 1.62 can reasonably match various parts of the data. Let's use an intermediate value of 1.42 (because it shows up a lot - but then too, so do 1.57 and 1.62).
4) Using that daily growth ratio and the R0, we can estimate the time from exposure to infectivity of the next generation - i.e. the generation time. Do that by taking the natural log of the R0 and dividing by the natural log of the growth factor - so ln(2.7)/ln(1.42) for example. In this case, that equals 2.83 days. That seems awfully fast. But then too, this is an "average" or more properly an equivalent behavior as the real distributions show. Another common pairing seems to be an R0 of 4.08 and a growth of 1.62 => 2.91 days per generation.
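The generation-time arithmetic in step 4 can be checked with a few lines of Python. This is only a sketch of the formula as stated (ln of R0 divided by ln of the daily growth factor); the function name is made up for illustration.

```python
import math

def generation_time(r0, daily_growth):
    """Days per generation implied by R0 and a daily growth factor.

    Solves daily_growth ** days == r0 for days.
    """
    return math.log(r0) / math.log(daily_growth)

print(round(generation_time(2.7, 1.42), 2))   # 2.83
print(round(generation_time(4.08, 1.62), 2))  # 2.91
```

Both parameter pairings land near 2.8 to 2.9 days, which is why they fit the same data about equally well.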
That now sets the basic form for the growth of the infected population. Even minor changes in those parameters can cause wild changes in the results. Do not rely on the results. They are cautions only. This is an exponential growth curve. So the trick then for this very simple (overly simple) model is to use the real data and various choices of parameters to best fit the parameters to the data, while also sanity checking that those make any sense. Even then, take the results with a huge dose of skepticism. This is an overly simple model.
Now we have the basics: an R0 of 2.7, a growth factor of 1.42/day, and a generation time of 2.83 days.
Next let's look at what we know from field reports. People apparently show symptoms about 5-7 days after exposure. Let's say that on average they go to hospital then. They get counted as "suspect cases". Over the next day, they are tested by PCR and the results return. They are either now cleared of having this virus, or they are confirmed. Given the size of the outbreak in Wuhan (Hubei), the vast majority of those are confirmed.
The confirmed population then is one day after the suspect count. I.e. it represents a population 1 day earlier in the growth and spread of the virus.
Next let's look at the people who died. We know hospitalizations last an average of 23.5 days. Using the data from the hospitals, we can estimate how far back in time their cohort of infected people was. One report put that at 6 days. Another suggested about 5.9 days. Let's say 6 days. So the scaling between that cohort and today's count is the growth factor raised to the power of the number of days: 1.42^6 = 8.2. In other words, today's confirmed count is about 8.2 times the size of the cohort that those who died came from.
On January 29, the confirmed count was 8,650. The dead count was 170. And the survived count was 130. That report may still not show effects of the quarantine, and so should be good for our purposes.
Take the 8,650 count of confirmed infected persons today and move it back in time 6 days by dividing by 1.42^6: 8,650 / 8.2 = 1,055 people in the cohort that those who died came from. Now divide 170 by 1,055. The result is 16.1% of the cohort dying. Take that with a huge grain of salt. We know from SARS that its death rate averaged about 10%. MERS averaged about 40%. So 16% is not unreasonable. But this is a really tentative calculation based on lots of assumptions using exponential growth data. It would likely be safe to assume that the actual death rate is somewhere in the range of 10-20% based on these parameters.
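Here is a minimal sketch of that back-projection, using the January 29 numbers and the 1.42/day growth factor. The function names are hypothetical, and the model is the same pure exponential one, so the same cautions apply.

```python
def cohort_size(confirmed_now, daily_growth, lag_days):
    """Size of the confirmed population lag_days ago under pure exponential growth."""
    return confirmed_now / daily_growth ** lag_days

def cohort_death_rate(deaths, confirmed_now, daily_growth, lag_days):
    """Deaths so far divided by the earlier cohort they are assumed to come from."""
    return deaths / cohort_size(confirmed_now, daily_growth, lag_days)

cohort = cohort_size(8650, 1.42, 6)
rate = cohort_death_rate(170, 8650, 1.42, 6)
print(f"cohort ~ {cohort:.0f}, death rate ~ {rate:.1%}")  # cohort ~ 1055, death rate ~ 16.1%

# Sensitivity: the same data with the low and high growth factors seen so far.
for growth in (1.32, 1.42, 1.62):
    print(growth, f"{cohort_death_rate(170, 8650, growth, 6):.1%}")
```

Running the loop shows the estimate swinging from roughly 10% at 1.32/day to roughly 36% at 1.62/day, which is the "minor changes cause wild changes" problem in concrete form.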
We do not have (or at least I have not seen) good data on the average time from admission to being declared disease free. But we can work the problem backward to get an estimate. Take the current count 8,650. We need to move back X days to the cohort that the survivors came from. Take the survivor count of 130 and divide by the fraction surviving to estimate the original cohort they came from. 130 /(1 - 0.161) = 155. Now 8,650 = 155 * 1.42^X, solve for X = 11.5 days.
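Solving for X is just a logarithm. A sketch, again with a made-up function name and the same assumed 16.1% death rate and 1.42/day growth:

```python
import math

def lag_days(confirmed_now, cohort, daily_growth):
    """Solve confirmed_now == cohort * daily_growth ** X for X."""
    return math.log(confirmed_now / cohort) / math.log(daily_growth)

# Cohort the 130 survivors came from, if 16.1% of that cohort died.
survivor_cohort = 130 / (1 - 0.161)
x = lag_days(8650, survivor_cohort, 1.42)
print(round(survivor_cohort), round(x, 1))  # 155 11.5
```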
Now sanity check that. Is it reasonable that people take 5.5 days longer to be confirmed recovered than to die based on what we know? It is well within the 23.5 days. But that raises a question why on average people are still in hospital for 12 days longer than on average being confirmed to have survived. That might be reasonable to avoid spreading infections by assuring they are no longer contagious. I simply do not know if that is reasonable. The large time span with an intense need for beds makes me suspect that this points to a serious flaw in the assumptions - an error of some sort.
Anyway, what you can see from this is that the estimate of those currently infected with pure exponential growth and no intervening factors would be about 1.42^7 = 11.64 times the current confirmed infected count. The count of those suspected to be infected should be about 1.42 times the confirmed count. The count of those who have died should be about 1/51st of the current confirmed count (170/8,650 in this example). And the count of those who survived should be about 70-90% of the count of those who died (76% by the math in this example).
So now let's go back and think about this again. We aren't dealing with just one generation of people. The counts represent the time summed total of many different generations. To do this calculation more correctly (still an overly simple model), we would have to do all of those in parallel and add them. We would also need to do a stochastic calculation using the uncertainty bands for all of this. As you can see, even using this "simple" exponential growth model, the problem gets complicated quickly.
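To give a feel for what a stochastic version might look like, here is a rough Monte Carlo sketch. It is not a proper epidemiological model. It assumes the growth factor is uniformly distributed over the 1.32 to 1.62 range and that the lag to death is uniform between 5 and 7 days; both distributions (and the 5-7 day spread) are my assumptions for illustration only.

```python
import random
import statistics

def sample_death_rates(deaths, confirmed_now, n=10_000, seed=0):
    """Draw n cohort death-rate estimates under assumed parameter ranges."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    rates = []
    for _ in range(n):
        growth = rng.uniform(1.32, 1.62)  # range of daily growth factors seen in the data
        lag = rng.uniform(5.0, 7.0)       # ASSUMED spread around the 6-day lag
        cohort = confirmed_now / growth ** lag
        rates.append(deaths / cohort)
    return rates

rates = sample_death_rates(170, 8650)
deciles = statistics.quantiles(rates, n=10)  # 9 cut points: 10th ... 90th percentile
print(f"10th pct: {deciles[0]:.1%}")
print(f"median:   {statistics.median(rates):.1%}")
print(f"90th pct: {deciles[-1]:.1%}")
```

The point is the width of the band, not any particular number in it. A fuller treatment would also sum over overlapping generations rather than treating the deaths as coming from a single cohort.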
But the complications get worse. We have information now that this disease predominantly kills people over age 55. The precise data for that is also messy, as the death rate has to be calculated using this same messy exponential math (or more complicated models). It is all too easy to get that wrong. That can be confused by things like how long it takes for elderly people to succumb compared to younger people. The younger people who presumably survive longer before dying represent an earlier smaller cohort of people. If we estimate their death rate from current data, it will look like they are more resistant even if they aren't. Etc...
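The delay bias is easy to see with numbers. In this sketch two groups have exactly the same true death rate (16%, borrowed from the estimate above), but one group takes longer to die. The 10-day delay for the slower group is a made-up value for illustration, and the function name is hypothetical.

```python
def naive_death_rate(true_rate, daily_growth, days_to_death):
    """Deaths so far divided by confirmed cases so far.

    Under pure exponential growth, today's deaths come from a cohort that
    was daily_growth ** days_to_death times smaller than today's count.
    """
    return true_rate / daily_growth ** days_to_death

for label, delay in [("faster group", 6), ("slower group", 10)]:
    print(label, f"{naive_death_rate(0.16, 1.42, delay):.2%}")
```

With these inputs the naive rate comes out near 2% for the faster group and near 0.5% for the slower group, even though both groups are equally vulnerable by construction.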
We also seem to be seeing a strong gender difference with 70% of the fatalities being male. That too may be caused by artificial biases. For example, if men have "better" access to health care, they may get earlier treatment, where the women might die at home and be counted later. Or, the men might be the ones going out into areas where they become infected, while the women don't. I do not know that any of these are true for this population. Nor do I mean to speculate that they are or even might be. I just mean to point out that fairly simple biases like these can make large differences in the short term results while the virus is spreading exponentially.
Once the quarantine went into effect, huge changes happened. People know to wear masks and use good hygiene practices to avoid spreading the disease. And these are societally mandated and enforced. Also, because of other factors, people are staying indoors and not frequenting the usual places. This dramatically reduces the opportunity for viral spread. But with a generation time of at least 2.83 days and a 5-7 day time for appearance of symptoms, the effects of these changes won't be seen until 5-7 days after they began, at the earliest. That is just about now.
Does this help?
Sam
Bearing all of that in mind, we still know far too little to be certain about much. We can be certain that it is a fast-spreading, lethal disease.