Annual Meeting of the NCI Cohort Consortium (Abstract Submission): Submission #13
Submission information
Submission Number: 13
Submission ID: 127583
Submission UUID: 370c5d24-f379-4b85-9d20-69afbb707912
Submission URI: /egrp/cohortconsortium/abstracts
Submission Update: /egrp/cohortconsortium/abstracts?token=fnXJaRkyO0SdnFVAO0pBi8KwW1rBW8dI0c5gELVkaAg
Created: Fri, 09/13/2024 - 16:50
Completed: Fri, 09/13/2024 - 16:55
Changed: Mon, 09/16/2024 - 16:40
Remote IP address: 10.208.28.69
Submitted by: Anonymous
Language: English
Is draft: No
Webform: Cohort 2024 (Abstracts Submission)
serial: '13'
sid: '127583'
uuid: 370c5d24-f379-4b85-9d20-69afbb707912
uri: /egrp/cohortconsortium/abstracts
created: '1726260653'
completed: '1726260931'
changed: '1726519258'
in_draft: '0'
current_page: ''
remote_addr: 10.208.28.69
uid: '0'
langcode: en
webform_id: cohort_2024_abstracts_submission
entity_type: node
entity_id: '1467'
locked: '0'
sticky: '0'
notes: ''
data:
  additional_authors:
    - add_author_degree: PhD
      add_author_first_name: Alejandro
      add_author_last_name: Molina-Villegas
      add_author_middle: ''
      add_author_organization: CONAHCyT-CentroGeo
    - add_author_degree: MS
      add_author_first_name: Karla
      add_author_last_name: Valdez-Trejo
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Pablo
      add_author_last_name: Lopez-Ramires
      add_author_middle: ''
      add_author_organization: CentroGeo
    - add_author_degree: PhD
      add_author_first_name: Alberto
      add_author_last_name: Simpser
      add_author_middle: ''
      add_author_organization: ITAM
    - add_author_degree: MS
      add_author_first_name: Adrian
      add_author_last_name: Cortes-Valencia
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Dalia
      add_author_last_name: Stern
      add_author_middle: ''
      add_author_organization: 'CONAHCyT-Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Karla
      add_author_last_name: Cervantes-Martinez
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Liliana
      add_author_last_name: Gomez-Flores-Ramos
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
  degree_s_: 'MD, ScD'
  email: mlajous@insp.mx
  first_name: Martin
  last_name: Lajous
  organization: 'Instituto Nacional de Salud Publica'
  poster_title: 'An Efficient Pipeline-Based Geocoding Approach to Handle Self-Reported Addresses in a Large Population-based Cancer Cohort in Mexico'
  short_biography_: |
    Background. Geocoding participants’ addresses in epidemiologic cohorts is now highly accurate in high-income countries. Non-standardized address notation, lack of address registries, and limited geocoding resources are important challenges for geocoding in resource-limited settings. We aimed to develop an efficient pipeline-based geocoding approach to handle self-reported addresses from participants in a cancer cohort in Mexico, assess the validity of coordinate assignment, and maximize geocoding success.
    Methods. We obtained self-reported addresses at baseline in 2006-2008 from 104,003 participants in the Mexican Teachers’ Cohort (n=115,275). After cleaning and standardization, we optimized processing times by splitting the data (651,668 candidate coordinates) and creating 105 Amazon Web Services (AWS) virtual machines to submit queries asynchronously to the ArcGIS REST API. We conducted geospatial verification by projecting candidate coordinates onto Mexico’s official neighborhood vector shapefile through a spatial join operation. We compared similarities between the self-reported and API-derived addresses using string alignment scoring metrics. To assess the accuracy of the procedure, we compared address coordinates to residential block-centroid coordinates available in the 2006 national voting registry database.
    Results. After discarding non-valid coordinates and conducting geospatial verification and similarity scoring, we assigned coordinates to 101,704 study participants. When we compared assigned coordinates to voting registry block-centroid coordinates for 81,270 participants, the median distance between coordinates was 0.17 km (interquartile range, 0.06-0.77). We increased geocoding coverage to 111,299 (97%) study participants by assigning voting registry-defined coordinates to 9,595 participants without a valid address.
    Conclusions. Address-level geocoding based on self-reported addresses can be efficiently achieved in large-scale epidemiological studies in Mexico.
  title: Faculty-Researcher
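
The two comparison steps described in the Methods (scoring self-reported against API-derived addresses, and measuring the distance between assigned coordinates and voting-registry block centroids) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not name the string alignment metric or the distance formula, so this example assumes `difflib.SequenceMatcher` as a stand-in similarity score and the haversine great-circle formula for distance. All function names are hypothetical.

```python
import math
from difflib import SequenceMatcher
from statistics import median

# Mean Earth radius in km; an assumption of this sketch, not from the abstract.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def address_similarity(reported, returned):
    """Similarity in [0, 1] between two address strings.

    Uses difflib's ratio as a stand-in for the unspecified
    string alignment scoring metrics mentioned in the abstract.
    """
    def normalize(text):
        # Lowercase and collapse runs of whitespace before comparing.
        return " ".join(text.lower().split())

    return SequenceMatcher(None, normalize(reported), normalize(returned)).ratio()


def median_distance_km(pairs):
    """Median distance over ((lat, lon), (lat, lon)) coordinate pairs."""
    return median(haversine_km(*a, *b) for a, b in pairs)
```

A call such as `median_distance_km([(assigned, centroid), ...])` over all participants with both coordinate sources would yield a summary comparable in kind to the median distance reported in the Results.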