Object of this study

This study is about survey questions on happiness using verbal response options. The aim is to transform observed responses to such questions into numerical values from 0 to 10.

Differences in survey questions on happiness
The findings on happiness stored in this database are largely based on responses to single survey questions. All these questions pertain to the subjective enjoyment of one's life as a whole, yet they are not quite the same. Questions differ slightly in the appraisal they bring out and there are also variations in the way responses are recorded. These differences are summarized in the scheme below.

Substantive meaning


The kind of happiness addressed, as denoted by the key term used, e.g. 'happiness' or 'life-satisfaction'


The period considered, e.g. 'generally' or 'these days'

Method of assessment


The technique by which happiness is assessed, in general population surveys typically single direct questions


On what kind of rating scale the responses are scored, e.g. response options designated with numbers, smilies or verbal labels


Number of degrees of happiness distinguished, e.g. 3, 7 or 10



Variation in phrasing of otherwise equivalent items, e.g. 'completely happy' as the highest response option instead of 'very happy'

These differences are acknowledged in the coding system used for happiness measures. See the Introductory Text to the Item bank of this World Database of Happiness, chapter 4.

Strategies for comparison
These differences are a problem for this database, which aims at comparing findings on happiness across time and nations. If we limited such comparison to responses to identical questions, only a part of the available data could be used. To enhance the comparability of the available findings we follow two main strategies: 1) we group the results of questions that are similar enough to allow meaningful comparison, and 2) we convert scores on these questions to one common scale. These strategies are explained in more detail in the introductory text to the catalog of 'Happiness in Nations', chapter 7.

Grouping equivalent questions
Questions that address the same substantive meaning are considered to be 'equivalent' items, even if they differ in the way responses are rated. Equivalent items have the same focus code in our item classification, e.g. O-HL for questions about overall evaluations using the term 'happiness' and O-SLW for questions using the term 'satisfaction with life as a whole'. In this section on Happiness in Nations we group findings accordingly: responses to questions about 'life-satisfaction' are presented separately from ratings of life on a ladder ranging from 'best possible' to 'worst possible'. Within these categories we further sort by nation and time. This focuses the comparative analysis on equivalent items.

Converting scores to a common scale
This leaves us with differences in the rating of responses. Otherwise equivalent questions are sometimes rated on numerical scales and sometimes on scales with verbal response options. These rating scales differ in the number of response options they offer, and the verbal rating scales differ further in the words used to label the response options. We deal with these differences by transforming the scores on these different scales to a range of 0 to 10. In this section on Happiness in Nations we present the transformed means and standard deviations next to the ones observed on the original scale. See for example table 111A.

Linear transformation
Conversion to a scale of 0 to 10 is easy when responses are made on another numerical scale. In that case we apply a linear transformation. Since response scales are mostly shorter than the 11-step 0-10 scale, this involves 'stretching' the responses on the original scale. For example, an average score of 7 on a 1 to 9 numerical scale is stretched to 7.5 on range 0-10. The formula used for this transformation is:

Transformed mean on range 0-10 = (mean on original scale - lowest possible score on original scale) / (highest possible score on original scale - lowest possible score on original scale) x 10
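
As an illustration, the stretching can be sketched in a few lines of Python. This is a minimal sketch and not the code used for this database: the function names to_0_10 and sd_to_0_10 are hypothetical. The factor 10 follows from the worked example above, in which a mean of 7 on a 1 to 9 scale becomes 7.5. The rescaling of the standard deviation is an assumption based on the general property that a linear transformation multiplies a standard deviation by the stretch factor only, without any shift.

```python
def to_0_10(mean, lowest, highest):
    """Linearly rescale a mean from the range lowest..highest to 0-10."""
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    return (mean - lowest) / (highest - lowest) * 10


def sd_to_0_10(sd, lowest, highest):
    """Rescale a standard deviation by the same stretch factor (assumption).

    A standard deviation is unaffected by the shift, so only the
    multiplication by 10 / (highest - lowest) applies.
    """
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    return sd / (highest - lowest) * 10


if __name__ == "__main__":
    # The example from the text: a mean of 7 on a 1 to 9 scale.
    print(to_0_10(7, 1, 9))  # 7.5
```

Note that the lowest and highest possible scores of the original scale are needed, not the lowest and highest scores actually observed.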

Expert weighting of response options
Linear transformation is not appropriate in the case of rating scales using verbal response options. There are several reasons for this:

The main reason is that the distance between response options is more variable than with numerical scales. In the case of numerical scales we can assume that the difference between score 1 and score 2 is about the same as between score 2 and score 3. Yet distances between response options of verbal rating scales can differ considerably, e.g. the distance between 'very happy' and 'quite happy' would seem to be smaller than the distance between 'quite happy' and 'not very happy'.

Another reason is that questions differ in the words used for response options, e.g. in the above example when the second option is labeled 'pretty happy' instead of 'quite happy'. As a result the distances between response options vary even more. These inaccuracies are magnified when we stretch the scores to range 0-10. Since verbal response scales are typically much shorter than numerical scales, this gives rise to considerable distortion.

We solved these problems using a method proposed by Thurstone as early as 19??. In Thurstone's technique judges rate the value of verbal response options on a numerical scale, e.g. 7.2 for 'fairly happy' and 5.6 for 'not too happy' on range 0 to 10. When more than one judge is involved, the average rating is used. These values are then used for computing weighted averages.
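
The two steps described above, averaging the judges' ratings per response option and then computing a weighted average over the observed responses, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names option_weights and weighted_mean are hypothetical, and the judge ratings and response counts in the example are invented for illustration only. They are not the values obtained in any of our studies.

```python
from statistics import mean


def option_weights(judge_ratings):
    """Average the judges' 0-10 ratings for each verbal response option."""
    return {option: mean(ratings) for option, ratings in judge_ratings.items()}


def weighted_mean(weights, counts):
    """Mean on range 0-10: option weights averaged over the responses.

    counts maps each response option to the number (or percentage)
    of respondents who chose it.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one response is required")
    return sum(weights[option] * n for option, n in counts.items()) / total


if __name__ == "__main__":
    # Hypothetical ratings by three judges (illustrative numbers only).
    judge_ratings = {
        "very happy": [9.0, 9.5, 9.4],
        "fairly happy": [7.0, 7.5, 7.1],
        "not too happy": [5.5, 5.8, 5.5],
    }
    # Hypothetical percentages of respondents per response option.
    counts = {"very happy": 30, "fairly happy": 50, "not too happy": 20}

    weights = option_weights(judge_ratings)
    print(round(weighted_mean(weights, counts), 2))
```

Unlike the linear transformation, this approach does not assume equal distances between response options: each option carries its own rated value.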

Earlier weighing study
We applied Thurstone's method in 1991 to some forty commonly used verbal response scales. Twelve judges rated each of the response options on the forty scales on a range of 0 to 10, e.g. 7 for 'fairly happy' and 5 for 'not too happy'. As the ratings differed slightly, we used the averages. For example, this was 9.3 for 'very happy' and 7.2 for 'quite happy'. These values are reported in section 7/3.3.2 of the introductory text to this catalog of Happiness in Nations. To the present day these estimates have been used to compute transformed means and averages. Yet this assessment falls short for several reasons:

One limitation is that the weights were made for response scales in English, whereas the judges were Dutch. Though all the judges were fluent in English, they may have missed some connotations that native speakers would recognize.

The second reason is that the weights were made only for the English version of the scales. It was assumed that translation into other languages would not change the weights. This is questionable: the French word 'heureux' may denote a higher degree of happiness than the English word 'happy'.

A third weakness is that the judges rated response options irrespective of their context in the response scale. Yet the degree of happiness denoted by the response option 'very happy' will be greater when this is the highest answer possible than when it comes next to 'completely happy' on the scale.

Lastly, the study covered only a limited number of survey questions and involved only a small number of judges.

Aim of this new weighing study
In this investigation we go through the questions again to obtain more accurate estimates of the relative values of the verbal response options used in survey questions about happiness. We will apply a better methodology on a much larger scale. The resulting values will be fed into the Item Bank and used to re-compute the transformed scores from the observed responses.

1)  A third strategy, not discussed here, is to transform the score on one question into the score on another. This can be done if both questions have been posed to the same subjects, so that a regression equation can be estimated. We used this method in the construction of the variable lsbw_90s. See the States of Nations codebook under 'happiness in nations'. This method applies only if rich data are available.
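
The regression approach mentioned in this note can be illustrated with an ordinary least-squares line fitted to paired scores. This is a minimal sketch only: the function names fit_line and convert are hypothetical, the paired scores are invented for illustration, and a simple linear regression is assumed here, which is not necessarily the equation used for constructing lsbw_90s.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("scores on the first question do not vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def convert(score, slope, intercept):
    """Predict the score on the second question from the first."""
    return intercept + slope * score


if __name__ == "__main__":
    # Hypothetical paired scores on two questions answered by the
    # same subjects (illustrative numbers only).
    question_a = [2.9, 3.1, 3.3, 3.0, 3.4]
    question_b = [6.5, 7.0, 7.6, 6.8, 7.9]
    slope, intercept = fit_line(question_a, question_b)
    print(round(convert(3.2, slope, intercept), 2))
```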


