January 19, 2021
Corona virus infections in Finland have been concentrated most heavily among the (adult) individuals with the lowest incomes. However, infections have also been prevalent in the very highest income class. During the first wave in the spring (March-June), corona virus infections were most common in the highest income class, whereas during the fall (July-November) infections especially affected the very lowest income class. During the fall, the level of infections among those with a secondary education has been close to the level among those with a graduate degree.
The absolute and relative numbers of corona virus infections also exhibit clear differences between professions, but these differences are influenced by many background factors and testing practices independent of the place of employment. After taking certain background factors into account, the largest risks for corona virus infections are observed among nursing experts, nurses, personal care workers and other healthcare professionals. According to the results, painters and building structure cleaners also have large infection risks, but this estimate contains a lot of statistical uncertainty. The results presented in this Situation Room report can be used, for instance, to support decision making with regard to restrictions and the order of vaccination.
Concerns have been raised that the corona virus pandemic will most strongly affect the people already in the weakest positions (Ahmed et al., 2020). In practice, people can be exposed to very different risks of contracting the virus, for instance due to different job and housing arrangements. The possibilities for remote work are limited in certain professions, and especially people of lower socioeconomic status may find it hard to adapt their job duties to the restrictions imposed by society to combat the virus.
International research literature has studied the socioeconomic differences in corona virus infections, and the impact of the virus on health outcomes (Magnusson et al. 2021, Killerby et al. 2020, Price-Haywood et al. 2020). Finnish research on the incidence of the virus and its health impacts in different socioeconomic groups, however, is still limited.
This report gives an overview of the current situation with regard to the incidence of corona virus infections in Finland across educational, income, and professional categories. We are especially interested in the potentially vulnerable socioeconomic groups, such as the lower income classes, and the professions where possibilities for remote work are limited.
Corona infections across income deciles
Figure 1 below presents the numbers of adults who have contracted the corona virus across income deciles. Each bar (decile) contains an equal share (10%) of the Finnish population, meaning that if there were no association between income and risk of infection, the bars should be approximately equally tall. The income deciles have been constructed according to disposable money income in 2018, because more recent information on all income types, especially transfers, is not yet available. Data on the corona virus infections comes from the Finnish National Infectious Diseases Register, which for 2020 covers the weeks 1-47. Income data is available for around 83% of the persons registered as infected. Of those for whom no income data is available, 60% are under 18 years old.
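The decile construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the report: the function names `assign_deciles` and `infections_by_decile` and the toy data are hypothetical, and people are simply ranked by income and split into ten equally sized groups.

```python
def assign_deciles(incomes):
    """Return a decile (1..10) for each person by ranking incomes,
    so that each decile holds roughly 10% of the people."""
    n = len(incomes)
    order = sorted(range(n), key=lambda i: incomes[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


def infections_by_decile(incomes, infected):
    """Count infected people in each income decile (index 0 = lowest decile)."""
    counts = [0] * 10
    for decile, is_infected in zip(assign_deciles(incomes), infected):
        if is_infected:
            counts[decile - 1] += 1
    return counts


# Hypothetical toy data: 20 people, infections at both ends of the distribution.
incomes = [1000 * (i + 1) for i in range(20)]
infected = [i in (0, 1, 19) for i in range(20)]
counts = infections_by_decile(incomes, infected)
print(counts)
```

If income and infection risk were unrelated, the ten counts would be roughly equal in a large sample, which is the benchmark the figure is read against.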
The figure shows that infections are clearly the most prevalent in the lowest income class. The upper limit of disposable money income for belonging to the lowest decile is 10 600 euros per year. Although the number of infections is highest in the lowest income decile, above it the number rises with income, with two exceptions.
Especially during the fall, the virus was thought to be spreading among students. However, the data shows that during 2020 most of the infections in the lowest decile were not among students. In figure 2, where students are left out, the upper limit of the lowest decile is 11 500 euros per year. Even without students, the lowest decile clearly stands out from the next deciles. Perhaps a bit surprisingly, the number of infections is the lowest in the second and third deciles, after which the numbers rise with income. With students not included, the number of infections is actually the highest in the highest income decile.
During the spring, corona virus infections were suspected to come from high income individuals returning from abroad, while during the fall the infections may have increased in the lower income classes, for example among students. Figures 3 and 4 below show that the distributions of infections across income classes were very different during the spring and the fall. Students are included in these figures. During the spring, there were more infections in the higher deciles, while during the fall the infections have been concentrated in the lower deciles.
Corona infections by level of education
Figure 5 below shows the adults infected by the corona virus by level of education during the spring and the fall. As with income, the information on the level of education is also from 2018. The information is available for around 60% of the infected persons. Of those infected for whom no data on level of education is available, 45% are underage. The figure also presents the number of infections among those for whom the education level is missing. The level of education may be missing if the person has only graduated from basic education, or moved to Finland, after 2018.
Figure 5 presents the absolute number of infections within each category. When the spring and the fall numbers are added together, we see that most people who contracted the virus had either a secondary education or no information on the level of education. During the fall, there were more covid patients with secondary education than during the spring. However, the comparison between the groups is made difficult because the number of people in each category is not taken into account.
Figure 6 shows the relative numbers of infections in each category. The numbers are relative to the number of people with the given level of education in the population. This reveals that, in relative terms, there were more infections among people with graduate degrees than people with lower levels of education both during the spring and the fall. However, the largest within-class growth of the relative numbers from spring to fall happened among people with a secondary education. During the fall their relative infection numbers have been almost as high as for people with graduate degrees.
Corona infections in different professions
People in different professions can be in a very different position when it comes to the risk of contracting the corona virus. Remote work is used to prevent the spread of the virus, but the possibilities for remote work are clearly more limited in many low-paid jobs, for instance in customer service and construction professions. Also, working in health care and education has been feared to expose one to the virus more than other professions do.
Below, we present figures of corona virus infections by profession. When interpreting the figures, it is good to keep in mind that they do not tell us where the infections come from. It is possible that the person has contracted the virus at work, or outside the workplace, e.g. from a family member or in a hobby. Furthermore, people in different professions differ from each other in multiple ways, which makes the interpretation more difficult. For instance, people working in low-paying professions might have a larger family or less living space per person, which would also help spread the virus. There can also be differences in the amounts of testing between professions, which may affect the number of observed cases. However, when estimating the relative risk of exposure to the coronavirus associated with each profession, we do control for some background factors, such as the sex, age, national background, and municipality of residence of the worker.
The information on the profession comes from the income register of the year 2020. It is available for around 64% of those who have contracted the virus. The profession categories utilized are based on the Classification of Occupations by Statistics Finland. We first present the absolute and relative numbers in each profession category. After this, we look at the infection risk, or the relative number of infections, of the different professions with multiple background characteristics taken into account utilizing regression (linear and logistic) models. The regression models also aim to consider the statistical uncertainty in the estimates via confidence intervals.
Figure 7 below shows the 20 professions where the number of infections has been the highest up to November 2020. The absolute number is the largest for personal care workers, sales workers, delivery workers, and cleaners. Nurses and childcare workers also have a relatively large number of infections. What these professions have in common is that they are quite low-paid, and the possibilities for remote work are limited. The interpretation, however, is made more difficult by the fact that the sizes of the profession groups are very different, and especially the categories with the most infections contain relatively large numbers of workers. For this reason, it is important to also look at the relative numbers of infections.
Figure 8 below shows the 20 professions with the largest relative numbers of infections. The relative numbers are calculated by dividing the absolute numbers by the number of people with a given profession in the population according to the income register. Painters and building structure cleaners are clearly distinguished from the other professions, with more than 1.2% of the people in the profession contracting the virus by the end of week 47.
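The calculation of relative numbers can be illustrated with a minimal Python sketch. The profession labels and counts below are hypothetical placeholders, not figures from the income register; only the division of infections by group size follows the text.

```python
def relative_infections(infections, workers):
    """Share of people in each profession who contracted the virus:
    absolute number of infections divided by the size of the profession."""
    return {
        profession: infections[profession] / workers[profession]
        for profession in infections
        if workers.get(profession, 0) > 0
    }


# Hypothetical counts: a small group with few cases can still rank first.
infections = {"profession_a": 12, "profession_b": 30}
workers = {"profession_a": 1000, "profession_b": 10000}

shares = relative_infections(infections, workers)
ranked = sorted(shares, key=shares.get, reverse=True)
for profession in ranked:
    print(f"{profession}: {shares[profession]:.2%}")
```

In this toy example the smaller profession has fewer infections in absolute terms but the larger share, which is why the ranking by relative numbers can differ from the ranking by absolute numbers.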
Although the relative numbers are suggestive of the infection risk of each profession, the estimate can be affected by various background characteristics of the workers that make up the group, such as age, national background and municipality of residence. These factors can especially affect the numbers among the two highest ranking professions in figure 8, i.e. painters and building structure cleaners, and drivers. Different background characteristics make the comparison between the infection risks associated with each occupation more difficult. To take the background characteristics into account, they have to be kept constant in comparisons by utilizing statistical methods and regression models.
In the next figures, we present estimates based on linear and logistic regressions for the infection risks of different professions, compared to a reference group. The reference group consists of those who, according to the income register, do not work in any occupation. In linear regression, the per-profession risks are estimated as the coefficients of indicator variables, and in logistic regression as the so-called odds ratios.
The following factors are kept constant in the regression models: age, sex, national background, and the municipality of residence. By keeping these factors constant, we can at least partially take into account the differing background characteristics of the persons in different professions, and their association with the risk of contracting the corona virus. For instance, including the national background variable makes sure that people with a Finnish background are compared with each other, while people born in Finland but with a foreign background are compared to each other, and people born outside of Finland to each other. If a specific profession had a disproportionate number of workers with a foreign background, keeping the background constant controls for the potential effect of foreign background on the risk of contracting the virus. Similarly, keeping age and the municipality of residence constant controls for the potentially different age structure of workers in different occupations, and the higher incidence of some occupations in larger cities.
Figure 9 presents the results of the linear regression and figure 10 the results of the logistic regression. Both figures cover all people between the ages of 20 and 64 living in Finland. The sample size is a bit over 3 million.
In the linear model, the infection risk is assessed via the probability of contracting the virus within a profession. The x-axis in figure 9 shows how much higher the probability is for a given profession compared to the reference group, i.e. the people with no profession. In the comparison group, around 0.38% of individuals have contracted the virus according to the data.
The results suggest that nurses have a likelihood of contracting the virus that is around 0.5 percentage points higher than that of people with no profession, once age, national background and municipality of residence are taken into account. This can be interpreted as five more infections among 1000 nurses than among 1000 people in the reference group. The line going through the dot representing the point estimate shows the confidence interval of the estimate. If the line doesn’t touch the red vertical line, the estimate is statistically significantly different from the comparison group. In the linear model, the intervals are quite wide for many professions, and overlap with each other. This means that the estimates for those professions cannot be clearly distinguished from each other statistically.
According to the linear model, the greatest risks of infection are for painters and building structure cleaners, nursing experts, sports and fitness workers, drivers, nurses, and personal care workers. Of these groups, however, the estimate for painters and building structure cleaners is not statistically significant at conventional levels, because the confidence interval contains zero (the p-value is 0.09). In other words, the estimated infection risk of painters and building structure cleaners carries a lot of statistical uncertainty.
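How a linear-model estimate is read can be sketched in Python as follows. The 0.5 percentage point coefficient for nurses is taken from the text; the second estimate and its standard error are hypothetical values chosen only so that the p-value lands near 0.09, and the 95% interval uses a normal approximation, which is an assumption rather than the report's documented method.

```python
import math


def extra_cases_per_1000(coef_pp):
    """Convert a coefficient in percentage points into extra infections
    per 1000 people (1 percentage point = 10 per 1000)."""
    return coef_pp * 10


def confidence_interval(estimate, std_error, z=1.96):
    """Normal-approximation 95% confidence interval."""
    return (estimate - z * std_error, estimate + z * std_error)


def two_sided_p_value(estimate, std_error):
    """Two-sided p-value for the hypothesis that the true coefficient is zero."""
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2))


# From the text: nurses are about 0.5 percentage points above the reference group.
nurse_extra = extra_cases_per_1000(0.5)

# Hypothetical estimate and standard error (percentage points), for illustration only.
estimate_pp, std_error_pp = 0.90, 0.53
low, high = confidence_interval(estimate_pp, std_error_pp)
p = two_sided_p_value(estimate_pp, std_error_pp)

print(f"extra cases per 1000 nurses: {nurse_extra:.1f}")
print(f"95% CI: ({low:.2f}, {high:.2f}), p-value: {p:.3f}")
print("interval contains zero:", low <= 0 <= high)
```

A large point estimate with a wide interval that includes zero is the pattern described above for painters and building structure cleaners: the estimate is high, but it cannot be separated from the reference group with confidence.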
In the logistic regression, the infection risk is assessed via odds, i.e. the ratio between the probability of contracting the disease and the probability of not contracting it. For instance, if the probability of contracting the virus in a given profession is 1%, the odds are 0.01/0.99 (which equals approximately 0.01). The odds ratio shown in figure 10, in turn, is the ratio between the odds of a given profession and the odds of the reference group. For instance, the odds ratio for nursing experts is 3.6. Because infections are rare in every group, the odds ratio is close to the ratio of the infection risks, so their infection risk is more than three times as large as the infection risk of the people without a profession. Although the relative differences in the infection risks of these different groups are large, the differences in absolute terms are still less than one percentage point. The results of the linear and logistic regressions are mostly in line with each other.
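The arithmetic behind odds and odds ratios can be checked with a short Python sketch. It combines the 0.38% infection share of the reference group with the odds ratio of 3.6 for nursing experts, both taken from the text. Applying an adjusted odds ratio to an unadjusted baseline share is a simplifying assumption made here for illustration, so the implied probability is only a rough guide.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


def prob_from_odds(o):
    """Probability corresponding to odds o."""
    return o / (1 + o)


# A 1% probability corresponds to odds of 0.01 / 0.99.
example_odds = odds(0.01)

baseline_p = 0.0038   # share infected in the reference group (from the text)
odds_ratio = 3.6      # odds ratio for nursing experts (from the text)

# Scale the baseline odds by the odds ratio and convert back to a probability.
implied_p = prob_from_odds(odds(baseline_p) * odds_ratio)
diff_pp = (implied_p - baseline_p) * 100
risk_ratio = implied_p / baseline_p

print(f"odds at 1%: {example_odds:.4f}")
print(f"implied probability: {implied_p:.2%}")
print(f"absolute difference: {diff_pp:.2f} percentage points")
print(f"implied risk ratio: {risk_ratio:.2f}")
```

Under these assumptions the implied risk ratio stays close to the odds ratio, and the absolute difference remains below one percentage point, consistent with the statement in the text.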
According to the logistic model, the greatest infection risks are for nursing experts, painters and building structure cleaners, nurses, personal care workers and other health care professionals.
The confidence intervals for the estimates in the case of the logistic regression are somewhat narrower than for the linear regression, meaning that the statistical uncertainty is smaller. Still, the intervals overlap for several professions. In the logistic model, the estimate for painters and building structure cleaners is also statistically significantly different from one (which is the odds ratio of the comparison group).
Ahmed F, Ahmed N, Pissarides C, Stiglitz J. (2020) Why inequality could spread COVID-19. Lancet Public Health. 5(5):e240.
Killerby ME, Link-Gelles R, Haight SC, et al. (2020) Characteristics Associated with Hospitalization Among Patients with COVID-19 - Metropolitan Atlanta, Georgia, March–April 2020. Morbidity and Mortality Weekly Report (MMWR).
Magnusson K, Nygård K, Vold L and Telle K. (2021) Occupational risk of COVID-19 in the 1st vs 2nd wave of infection. MedRxiv.
Price-Haywood EG, Burton J, Fort D, Seoane L. (2020) Hospitalization and Mortality among Black Patients and White Patients with Covid-19. New England Journal of Medicine; 382:2534-2543.
Roser M, Ritchie H, Ortiz-Ospina E and Hasell J. (2020) Coronavirus Pandemic (COVID-19). OurWorldInData.org. Retrieved from: https://ourworldindata.org/coronavirus