Combining Data to Understand Violence: A New Approach

England and Wales, United Kingdom | Wed Jan 15 2025
Violence research is often held back by restricted data access and safety concerns. Individual studies exist, but they rarely combine data sources. Ideally, linking records for the same individuals over time would show how people experience violence and how it affects them. This paper demonstrates how to create a synthetic dataset by merging data from the Crime Survey for England and Wales (CSEW) with administrative data from Rape Crisis England and Wales (RCEW). The goal was to fill in information missing from one dataset using the distribution observed in the other. By borrowing information from CSEW, the authors filled gaps in the RCEW data, producing a combined RCEW-CSEW dataset. The method, called look-alike modeling, offers a cost-effective way to explore violence patterns across different sectors. It treats data integration as a missing data problem and uses multiple imputation to combine the datasets. The authors tested the method by comparing regression analyses on several types of variables (binary, continuous, categorical) and an outcome measure. The effect sizes in the combined dataset matched those from the dataset used for imputation, though with higher variance and fewer statistically significant estimates. This suggests that linking administrative and survey datasets with look-alike methods can help overcome barriers to data linkage.
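
To make the "data integration as a missing data problem" idea concrete, here is a minimal Python sketch. It is not the authors' look-alike procedure. It assumes a toy setup in which a donor dataset (standing in for the survey) records both x and y, while a recipient dataset (standing in for the administrative records) records only x. The missing y values are imputed several times from a regression fitted on the donor, and the results are pooled with Rubin's rules, the standard pooling step for multiple imputation (the summary does not say which pooling rule the paper used). All data, variable names, and functions below are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute_once(donor, recipient_x, rng):
    """Fill the recipient's missing y with one stochastic regression draw.

    Refitting on a bootstrap resample of the donor rows is a simple
    stand-in for parameter uncertainty (an assumption of this sketch).
    """
    boot = [rng.choice(donor) for _ in donor]
    a, b, sd = fit_line([row[0] for row in boot], [row[1] for row in boot])
    return [a + b * x + rng.gauss(0, sd) for x in recipient_x]


def pool_rubin(estimates, variances):
    """Pool m estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average sampling variance
    between = statistics.variance(estimates)   # spread across imputations
    total = within + (1 + 1 / m) * between
    return q_bar, total


rng = random.Random(0)

# Hypothetical toy data: the donor has (x, y) pairs, the recipient has x only.
donor = []
for _ in range(200):
    x = rng.uniform(0, 10)
    donor.append((x, 2.0 + 0.5 * x + rng.gauss(0, 1)))
recipient_x = [rng.uniform(0, 10) for _ in range(100)]

m = 20  # number of imputed datasets
estimates, variances = [], []
for _ in range(m):
    y_imp = impute_once(donor, recipient_x, rng)
    estimates.append(statistics.fmean(y_imp))                  # estimate: mean of y
    variances.append(statistics.variance(y_imp) / len(y_imp))  # its sampling variance

pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled mean of y: {pooled:.2f}, total variance: {total_var:.4f}")
```

The sketch pools a simple mean rather than regression coefficients, but the same pooling applies to any estimate. Because the total variance adds a between-imputation term on top of the average within-imputation variance, pooled estimates are less precise than those from a fully observed dataset, which is consistent with the higher variance and fewer significant estimates reported for the combined data.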
https://localnews.ai/article/combining-data-to-understand-violence-a-new-approach-c29cdb17

questions

    Did the datasets have to go on a blind date before agreeing to combine their info?
    How does the synthetic dataset created by combining CSEW and RCEW data fare in terms of accuracy when compared to individual datasets?
    What are the potential biases introduced by combining datasets with different methodologies?
