TwinLife Documentation TwinLife Documentation TwinLife Documentation
  • Overview and getting started
    • Overview (Table of content)
    • Getting started
    • Data access & documentation
  • 1. About TwinLife
    • 1.1 Basic Concept
    • 1.2 Study Design and Sample Structure
    • 1.3 Where and how to get the Data
  • 2. Documentation of the study
    • 2.1 Data Documentation Website and ShortGuide
    • 2.2 Documentation within the Data Sets
    • 2.3 paneldata.org
    • 2.4 Codebooks
    • 2.5 Technical Report Series, Methodology Reports, and Working Paper Series
  • 3. Data Structure
    • 3.1 Data Formats and Data Files
    • 3.2 Person Types
    • 3.3 System of Variable Names
    • 3.4 ID Variables, Wave and Data Collection Identifiers
    • 3.5 Missing Types and their Meanings
    • 3.6 Delivered Para Data
    • 3.7 Weights
    • 3.8 Pecularities of Data
    • 3.9 How to match the Data Files
    • 3.10 Matching information from the parent-about-child questionnaire to the child's data set
  • 4. Check Routines
    • 4.1 Check routines
    • 4.2 Data Adjustment
  • 5. Generated Variables and Scales
    • 5.1 Generated Variables
    • 5.2 Generated Scales
  • 6. Publications and Citation
    • 6.1 Publications and Literature Database
    • 6.2 Citation
  • 7. Useful Links
  • Terms and Privacy
  • Downloads
  • Overview and getting started
    • Overview (Table of content)
    • Getting started
    • Data access & documentation
  • 1. About TwinLife
    • 1.1 Basic Concept
    • 1.2 Study Design and Sample Structure
    • 1.3 Where and how to get the Data
  • 2. Documentation of the study
    • 2.1 Data Documentation Website and ShortGuide
    • 2.2 Documentation within the Data Sets
    • 2.3 paneldata.org
    • 2.4 Codebooks
    • 2.5 Technical Report Series, Methodology Reports, and Working Paper Series
  • 3. Data Structure
    • 3.1 Data Formats and Data Files
    • 3.2 Person Types
    • 3.3 System of Variable Names
    • 3.4 ID Variables, Wave and Data Collection Identifiers
    • 3.5 Missing Types and their Meanings
    • 3.6 Delivered Para Data
    • 3.7 Weights
    • 3.8 Pecularities of Data
    • 3.9 How to match the Data Files
    • 3.10 Matching information from the parent-about-child questionnaire to the child's data set
  • 4. Check Routines
    • 4.1 Check routines
    • 4.2 Data Adjustment
  • 5. Generated Variables and Scales
    • 5.1 Generated Variables
    • 5.2 Generated Scales
  • 6. Publications and Citation
    • 6.1 Publications and Literature Database
    • 6.2 Citation
  • 7. Useful Links

1.2 Study Design and Sample Structure

  • Print
  • Email
  • Data collection began in 2014 with a population-based sample of 4,097 twin families. The cross-sequential survey design (see Figure 2) contains four twin birth cohorts with ~1,000 same-sex (both monozygotic and dizygotic) twin pairs.

    Face-to-face interviews within the households take place every other year, and telephone interviews are conducted in the consecutive years. In the Face-to-face interviews data was collected using a mixed-mode design: Participants were surveyed by an interviewer (computer-assisted personal interview; CAPI), by means of questionnaires on a tablet or laptop (computer-assisted self-interview; CASI), via a paper-and-pencil interview (PAPI) and/or via online questionnaires (computer-assisted web interview; CAWI).

    On the one hand, these mixed modes ensured the most suitable assessment strategy for each question type, i.e. more sensitive topics were covered via CASI. On the other hand, they allowed a certain degree of flexibility for the interviewer in order to minimize the total interview duration in the households.

    School reports and developmental check-up reports were scanned as part of the CAPI and encoded afterwards.

    Image
    Figure 2. Cross-sequential survey design.

    The cross-sequential structure is combined with an Extended Twin Family Design (ETFD, see Figure 3). The sample contains not only monozygotic and dizygotic twins, but also their biological, adoptive or foster parents, one biological, adoptive, or foster sibling (if available), as well as (if applicable) stepparents and partners of the twins. Thus, the ETFD captures both the biological family of the twins as well as the environment the twins live in.
    Image
    Figure 3. ETFD sample structure.

    There are two characteristics of data collection and sample composition, which are briefly explained here.

    First characteristic: Two subsamples

    When the study was implemented, two subsamples of the initial age cohorts (consisting of twins aged about 5, 11, 17, and 23 years) were drawn, which were born in consecutive years (see Figure 2). This was necessary to achieve a sufficiently large sample size for TwinLife based on the target population of twin families in Germany.

    As a result, the first subsample (Subsample a) consists of twins born in 1990/91, 1997, 2003 and 2009 and the second subsample (Subsample b) consists of twins born in 1992/93, 1998, 2004 and 2010. The subsample a) was interviewed for the first time in 2014 while subsample b) was interviewed first in 2015. Together, subsamples, a) and b) form the complete sample of TwinLife.

    Identifiers for the two subsamples are recorded in variable wav0100.

    Second characteristic: Two data collections are one survey wave

    Face-to-face interviews (F2F) are carried out biennially in TwinLife. In the year between, a part of the sample is surveyed via telephone interviews (CATI). The main purposes of the CATI interviews are: 1) keeping in touch with the participants, 2) updating the contact information of the families (tracking), 3) collecting some complementary data that could not be surveyed in the face-to-face interview due to time restrictions, and 4) keeping track of important life events and transitions. Therefore, each F2F data collection together with the consecutive CATI data collection including both subsamples a) and b) is defined as one survey wave.

    The identifiers for the survey waves are recorded in variable wav0200 and the identifiers for the type of data collection (F2F or CATI) are recorded in variable wav0300. The consecutive numbering for each data collection is recorded in variable wid (see Table 1 for an overview).

    wav0200
    wav0300
    wid
    wav0100
    (planned) release date
    Wave 1
    Face-to-Face 1 (F2F 1)
    Data collection 1
    Subsample a
    Subsample b
    Autumn 2017
    Telephone interview 1 (CATI1)
    Data collection 2
    Subsample a
    Subsample b
    January 2019
    Wave 2
    Face-to-Face 2 (F2F 2)
    Data collection 3
    Subsample a
    Subsample b
    Spring 2020
    Telephone interview 2 (CATI2)
    Data collection 4
    Subsample a
    Subsample b
    Winter 2020
    Wave 3
    Face-to-Face 3 (F2F 3)
    Data collection 5
    Subsample a
    Subsample b
    Winter 2021
    Telephone interview 3 (CATI3)
    Data collection 6
    Subsample a
    Subsample b
    Winter 2022
    Wave 4
    Face-to-Face 4 (F2F 4)
    Data collection 7
    Subsample a
    Subsample b
    Winter 2023
    Telephone interview 4 (CATI4)
    Data collection 8
    Subsample a
    Subsample b
    Winter 2024
    Wave 5
    Face-to-Face 5 (F2F 5)
    Data collection 9
    Subsample a
    Subsample b
    Winter 2025
    Table 1. Survey and sample structure.

    Usually, there is one data release each year, including new data of either a F2F or a CATI data collection for both subsamples.
    For further information about the TwinLife sample and sampling design, please take a look at the ➔ methodology report of the first wave (Brix et al., 2017) and the corresponding ➔ article (Lang & Kottwitz, 2020). Methodology reports of all data collections including a fieldwork description and outcomes can be found at the ➔ Downloads section of TwinLife’s documentation website.
    • Prev
    • Next
    • Terms and Privacy
    • Downloads