3  Synthetic Data

3.1 Data preparation process

The synthetic data ships with the Docker container and is already placed in the expected directories (see Section 3.3).

3.2 Table example

The two tables have the following columns; the first row gives the column name and the second row the data type:

Patient data (saved as cases.csv)

patient_num   start_date           ICD10   domain
num           date (y-m-d h:m:s)   char    char

Demographic data (saved as dems_cases.csv)

patient_num   age   sex_cd       race_cd   ethnicity_cd   CHARLSON_INDEX
num           num   char (F/M)   char      char           num
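
As a rough illustration of the layout above, the following Python sketch checks a table's header against the expected column names and converts one row of cases.csv to typed values. The column names come from the tables above; everything else is an assumption: the files are taken to be comma-separated, patient_num is treated as an integer, "y-m-d h:m:s" is read as a timestamp such as 2020-03-15 08:30:00, and the helper names read_table and parse_case are hypothetical. The sample row is made up and is not taken from the synthetic data.

```python
import csv
import io
from datetime import datetime

# Column names as listed in the tables above.
CASES_COLUMNS = ["patient_num", "start_date", "ICD10", "domain"]
DEMS_COLUMNS = ["patient_num", "age", "sex_cd", "race_cd",
                "ethnicity_cd", "CHARLSON_INDEX"]

# Assumption: "y-m-d h:m:s" means a timestamp like 2020-03-15 08:30:00.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def read_table(handle, expected_columns):
    """Read a comma-separated file object and check its header row."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != expected_columns:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


def parse_case(row):
    """Convert one cases.csv row from strings to typed values."""
    return {
        "patient_num": int(row["patient_num"]),  # assumption: whole numbers
        "start_date": datetime.strptime(row["start_date"], DATE_FORMAT),
        "ICD10": row["ICD10"],
        "domain": row["domain"],
    }


# Made-up sample row, for illustration only.
sample = io.StringIO(
    "patient_num,start_date,ICD10,domain\n"
    "1,2020-03-15 08:30:00,E11.9,example\n"
)
cases = [parse_case(row) for row in read_table(sample, CASES_COLUMNS)]
```

To run the same check on the real file, pass an opened file instead of the in-memory sample, for example `open(path, newline="")`; DEMS_COLUMNS can be passed to read_table in the same way for dems_cases.csv.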

3.3 Directory guide

The directory tree inside the Docker container looks as follows:

  • /home/rstudio (working directory)
    • data
      • synthetic_data
        • cases.csv
        • dems_cases.csv
    • output
    • scripts
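
The tree above can be checked programmatically before running an analysis. The Python sketch below builds the paths from the listing and reports which ones are missing. The function names expected_paths and missing_paths are hypothetical, and the sketch assumes it runs inside the container (or is given a different root directory).

```python
from pathlib import Path


def expected_paths(root="/home/rstudio"):
    """Return the files and directories shown in the directory tree."""
    root = Path(root)
    data_dir = root / "data" / "synthetic_data"
    return {
        "cases": data_dir / "cases.csv",
        "dems_cases": data_dir / "dems_cases.csv",
        "output": root / "output",
        "scripts": root / "scripts",
    }


def missing_paths(root="/home/rstudio"):
    """List the names of expected entries that do not exist under root."""
    return [name for name, path in expected_paths(root).items()
            if not path.exists()]
```

Calling `missing_paths()` with no argument checks the default working directory; an empty list means every entry in the tree was found.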