Annual Meeting of the NCI Cohort Consortium (Abstract Submission): Submission #13

Submission information
Submission Number: 13
Submission ID: 127583
Submission UUID: 370c5d24-f379-4b85-9d20-69afbb707912

Created: Fri, 09/13/2024 - 16:50
Completed: Fri, 09/13/2024 - 16:55
Changed: Mon, 09/16/2024 - 16:40

Remote IP address: 10.208.28.69
Submitted by: Anonymous
Language: English

Is draft: No
serial: '13'
sid: '127583'
uuid: 370c5d24-f379-4b85-9d20-69afbb707912
uri: /egrp/cohortconsortium/abstracts
created: '1726260653'
completed: '1726260931'
changed: '1726519258'
in_draft: '0'
current_page: ''
remote_addr: 10.208.28.69
uid: '0'
langcode: en
webform_id: cohort_2024_abstracts_submission
entity_type: node
entity_id: '1467'
locked: '0'
sticky: '0'
notes: ''
data:
  additional_authors:
    - add_author_degree: PhD
      add_author_first_name: Alejandro
      add_author_last_name: Molina-Villegas
      add_author_middle: ''
      add_author_organization: CONAHCyT-CentroGeo
    - add_author_degree: MS
      add_author_first_name: Karla
      add_author_last_name: Valdez-Trejo
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Pablo
      add_author_last_name: Lopez-Ramires
      add_author_middle: ''
      add_author_organization: CentroGeo
    - add_author_degree: PhD
      add_author_first_name: Alberto
      add_author_last_name: Simpser
      add_author_middle: ''
      add_author_organization: ITAM
    - add_author_degree: MS
      add_author_first_name: Adrian
      add_author_last_name: Cortes-Valencia
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Dalia
      add_author_last_name: Stern
      add_author_middle: ''
      add_author_organization: 'CONAHCyT-Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Karla
      add_author_last_name: Cervantes-Martinez
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
    - add_author_degree: PhD
      add_author_first_name: Liliana
      add_author_last_name: Gomez-Flores-Ramos
      add_author_middle: ''
      add_author_organization: 'Instituto Nacional de Salud Publica'
  degree_s_: 'MD, ScD'
  email: mlajous@insp.mx
  first_name: Martin
  last_name: Lajous
  organization: 'Instituto Nacional de Salud Publica'
  poster_title: 'An Efficient Pipeline-Based Geocoding Approach to Handle Self-Reported Addresses in a Large Population-based Cancer Cohort in Mexico'
  short_biography_: |
    Background. Geocoding participants’ addresses in epidemiologic cohorts is now highly accurate in high-income countries. In resource-limited settings, non-standardized address notation, the lack of address registries, and limited geocoding resources remain important challenges. We aimed to develop an efficient pipeline-based geocoding approach to handle self-reported addresses from participants in a cancer cohort in Mexico, assess the validity of coordinate assignment, and maximize geocoding success.

    Methods. We obtained self-reported addresses at baseline in 2006-2008 from 104,003 participants in the Mexican Teachers’ Cohort (n=115,275). After cleaning and standardization, we optimized processing times by splitting the data (651,668 candidate coordinates) and launching 105 Amazon Web Services (AWS) virtual machines to submit queries asynchronously to the ArcGIS REST API. We conducted geospatial verification by projecting candidate coordinates onto Mexico’s official neighborhood vector shapefile with a spatial join operation. We compared the self-reported and API-derived addresses using string alignment scoring metrics. To assess the accuracy of the procedure, we compared address coordinates to residential block-centroid coordinates available in the 2006 national voting registry database.

    Results. After discarding invalid coordinates and conducting geospatial verification and similarity scoring, we assigned coordinates to 101,704 study participants. When we compared assigned coordinates to voting registry block-centroid coordinates for 81,270 participants, the median distance between coordinates was 0.17 km (interquartile range, 0.06-0.77 km). We maximized geocoding to 111,299 (97%) study participants by assigning voting registry-defined coordinates to 9,595 participants without a valid address.

    Conclusions. Address-level geocoding based on self-reported addresses can be efficiently achieved in large-scale epidemiological studies in Mexico.
  title: Faculty-Researcher
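
Illustrative sketch (not part of the submission). The abstract describes two validation checks: scoring the similarity between self-reported and API-derived addresses, and measuring the distance between assigned coordinates and voting registry block centroids. The Python below is a minimal sketch of both under stated assumptions. The abstract does not name its string alignment metric, so difflib's SequenceMatcher ratio is swapped in as a stand-in. The haversine great-circle formula and the inclusive quartile method are likewise assumptions, and all function names and sample values are hypothetical.

```python
import math
import statistics
from difflib import SequenceMatcher

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def address_similarity(reported: str, returned: str) -> float:
    """Return a similarity score in [0, 1] for two address strings.

    Stand-in metric: difflib's SequenceMatcher ratio after lowercasing and
    collapsing whitespace. The abstract's actual metric is not specified.
    """
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    return SequenceMatcher(None, normalize(reported), normalize(returned)).ratio()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_and_iqr(distances):
    """Return (median, (q1, q3)) for a sequence of at least two distances."""
    q1, q2, q3 = statistics.quantiles(distances, n=4, method="inclusive")
    return q2, (q1, q3)


if __name__ == "__main__":
    # Hypothetical (geocoded, block-centroid) coordinate pairs, for illustration only.
    pairs = [
        ((19.4326, -99.1332), (19.4330, -99.1340)),
        ((20.6597, -103.3496), (20.6600, -103.3500)),
        ((25.6866, -100.3161), (25.6900, -100.3200)),
    ]
    distances = [haversine_km(*geo, *centroid) for geo, centroid in pairs]
    print(median_and_iqr(distances))
    print(address_similarity("Av. Juarez 10, Centro", "AV.  JUAREZ 10, CENTRO"))
```

In a pipeline like the one described, a similarity threshold would decide which candidate coordinates to keep. That threshold is study-specific and is not given in the abstract.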