| Title: | 1000 Genomes Project Metadata |
|---|---|
| Description: | Metadata about populations and data about samples from the 1000 Genomes Project, including the 2,504 samples sequenced for the Phase 3 release and the expanded collection of 3,202 samples with 602 additional trios. The data is described in Auton et al. (2015) <doi:10.1038/nature15393> and Byrska-Bishop et al. (2022) <doi:10.1016/j.cell.2022.08.004>, and raw data is available at <http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/>. See Turner (2022) <doi:10.48550/arXiv.2210.00539> for more details. |
| Authors: | Stephen Turner [aut, cre] |
| Maintainer: | Stephen Turner <[email protected]> |
| License: | Apache License (>= 2) |
| Version: | 1.1.1 |
| Built: | 2025-02-15 03:08:39 UTC |
| Source: | https://github.com/stephenturner/kgp |
Population metadata for 212 populations drawn from the 1000 Genomes Project (kgp), Simons Genome Diversity Project (sgdp), Human Genome Diversity Project (hgdp), and Gambian Genome Variation Project (ggvp).
allmeta
A tibble with 212 rows and 8 columns:
Short population code
Short region code
Long population description
Long region description
Color for plotting this region on a map
Population latitude
Population longitude
Source dataset for the population (kgp = 1000 Genomes Project; ggvp = Gambian Genome Variation Project; hgdp = Human Genome Diversity Project; sgdp = Simons Genome Diversity Project).
Byrska-Bishop, Marta, et al. "High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios." Cell 185.18 (2022): 3426-3440.
1000 Genomes Project Consortium. "A global reference for human genetic variation." Nature 526.7571 (2015): 68.
Clarke, Laura, et al. "The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data." Nucleic acids research 45.D1 (2017): D854-D859.
License information is available at https://github.com/igsr/1000Genomes_data_indexes/blob/master/LICENSE. The 1000 Genomes data is made publicly available according to the Fort Lauderdale Agreement (https://www.genome.gov/Pages/Research/WellcomeReport0303.pdf).
Sample, pedigree, and population data for 2,504 samples in the Phase 3 release of the 1000 Genomes Project data.
kgp3
A tibble with 2504 rows and 10 columns:
Family ID
Individual ID
Paternal ID
Maternal ID
Sex (1=Male, 2=Female)
Sex as a factor
Short population code
Short region code
Long population description
Long region description
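The relationship between the integer sex code and its factor counterpart listed above can be illustrated with a minimal sketch. It uses a small made-up data frame rather than the packaged data. The column names `id`, `sex`, and `sexf` and the factor labels are assumptions for illustration, since this listing does not show the actual names; check `names(kgp3)` and `levels()` on the real column before adapting it.

```r
# Toy pedigree-style table; sample IDs and column names are hypothetical.
samples <- data.frame(
  id  = c("S1", "S2", "S3", "S4"),
  sex = c(1L, 2L, 2L, 1L),
  stringsAsFactors = FALSE
)

# Map the integer code (1 = male, 2 = female) to a labelled factor.
samples$sexf <- factor(
  samples$sex,
  levels = c(1L, 2L),
  labels = c("male", "female")
)

# Tabulate samples by sex.
sex_counts <- table(samples$sexf)
print(sex_counts)
```

Keeping both representations is convenient: the integer code matches pedigree-file conventions, while the factor gives readable labels in summaries and plots.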
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/
Byrska-Bishop, Marta, et al. "High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios." Cell 185.18 (2022): 3426-3440.
1000 Genomes Project Consortium. "A global reference for human genetic variation." Nature 526.7571 (2015): 68.
License information is available at https://github.com/igsr/1000Genomes_data_indexes/blob/master/LICENSE. The 1000 Genomes data is made publicly available according to the Fort Lauderdale Agreement (https://www.genome.gov/Pages/Research/WellcomeReport0303.pdf).
Sample, pedigree, and population data for 3,202 samples in the expanded 1000 Genomes Project data.
kgpe
A tibble with 3202 rows and 11 columns:
Family ID
Individual ID
Paternal ID
Maternal ID
Sex (1=Male, 2=Female)
Sex as a factor
Short population code
Short region code
Long population description
Long region description
Logical; indicates whether this sample is included in the Phase 3 release data
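As a rough sketch of how the logical Phase 3 flag and the parental ID columns might be used together, the example below subsets a toy table and picks out children whose parents are both present. All values are made up, the column names (`id`, `pid`, `mid`, `phase3`) are assumptions because this listing omits them, and treating a parent ID of `"0"` as "no parent recorded" follows the usual pedigree-file convention rather than anything stated here.

```r
# Toy stand-in for the expanded sample table; all values are made up.
expanded <- data.frame(
  id     = c("C1", "F1", "M1", "U1"),
  pid    = c("F1", "0", "0", "0"),
  mid    = c("M1", "0", "0", "0"),
  phase3 = c(FALSE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Keep only samples flagged as part of the Phase 3 release.
phase3_only <- expanded[expanded$phase3, ]

# Assumption: a parent ID of "0" means no parent is recorded.
has_both_parents <- expanded$pid != "0" & expanded$mid != "0"
trio_children <- expanded$id[has_both_parents]

# A trio counts as complete only if both parents also appear as samples.
complete <- expanded$pid[has_both_parents] %in% expanded$id &
  expanded$mid[has_both_parents] %in% expanded$id
complete_trio_children <- trio_children[complete]
print(complete_trio_children)
```

With the package loaded, the same subsetting pattern could be applied to `kgpe` once the real column names have been confirmed.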
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/
Byrska-Bishop, Marta, et al. "High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios." Cell 185.18 (2022): 3426-3440.
1000 Genomes Project Consortium. "A global reference for human genetic variation." Nature 526.7571 (2015): 68.
License information is available at https://github.com/igsr/1000Genomes_data_indexes/blob/master/LICENSE. The 1000 Genomes data is made publicly available according to the Fort Lauderdale Agreement (https://www.genome.gov/Pages/Research/WellcomeReport0303.pdf).
Population metadata for the 26 populations of the 1000 Genomes Project, which span five continental regions.
kgpmeta
A tibble with 26 rows and 7 columns:
Short population code
Short region code
Long population description
Long region description
Color for plotting this region on a map
Population latitude
Population longitude
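Because the sample tables and the population metadata share a short population code, one plausible use is to attach coordinates and region colors to each sample. The sketch below shows that join on toy data using base R. The population codes, coordinates, colors, and column names (`pop`, `reg`, `lat`, `lng`, `regcolor`) are placeholders assumed for illustration, not values taken from the package.

```r
# Toy population metadata; codes, coordinates, and colors are placeholders.
pops <- data.frame(
  pop      = c("AAA", "BBB", "CCC"),
  reg      = c("R1", "R1", "R2"),
  lat      = c(10.0, 12.5, -3.0),
  lng      = c(20.0, 22.0, 45.5),
  regcolor = c("#1b9e77", "#1b9e77", "#d95f02"),
  stringsAsFactors = FALSE
)

# Toy sample table carrying only a sample ID and a short population code.
samples <- data.frame(
  id  = c("S1", "S2", "S3", "S4"),
  pop = c("AAA", "AAA", "BBB", "CCC"),
  stringsAsFactors = FALSE
)

# Attach region, coordinates, and color to each sample via the population code.
annotated <- merge(samples, pops, by = "pop", all.x = TRUE)

# Count samples per region.
per_region <- table(annotated$reg)
print(per_region)
```

Using `all.x = TRUE` keeps every sample even if its population code has no match in the metadata, so unmatched codes show up as `NA` instead of silently disappearing.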
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/
Byrska-Bishop, Marta, et al. "High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios." Cell 185.18 (2022): 3426-3440.
1000 Genomes Project Consortium. "A global reference for human genetic variation." Nature 526.7571 (2015): 68.
License information is available at https://github.com/igsr/1000Genomes_data_indexes/blob/master/LICENSE. The 1000 Genomes data is made publicly available according to the Fort Lauderdale Agreement (https://www.genome.gov/Pages/Research/WellcomeReport0303.pdf).