Data science resources

There is a vast range of tools and resources available to mental health data scientists. The list below highlights some of those available. If you would like to suggest another resource for this list, please contact us.

Data platforms & search engines

Birthcohorts.net

This website provides information about the design and data of existing birth cohorts in a comparable, easily accessible form.

CLOSER Discovery 

CLOSER Discovery is an online resource that enables researchers to view and appraise data from eight leading UK longitudinal studies.

Dementias Platform UK Cohort Tools

Dementias Platform UK gathers descriptions of cohorts in one place and develops systems that make accessing the data quicker and easier.

Health Data Finder for Research

The health data finder provides information about routinely collected healthcare datasets available for research in the UK.

MRC Cohort Directory

The site includes all major general population cohorts with over 1,000 participants in the UK and has a free-text and filter search function (including one for mental health). There are currently 43 studies listed on the site, and 34 of these report collecting mental health measures.

UK Data Service

The UK Data Service is a single point of access to a wide range of secondary data including large-scale government surveys, international macrodata, business microdata, qualitative studies and census data from 1971 to 2011. All are backed with extensive support, training and guidance to meet the needs of data users, owners and creators. The UK Data Service promotes data sharing to encourage the reuse of data, and provides expertise in developing best practice for research data management.

Longitudinal studies and cohorts

Avon Longitudinal Study of Parents and Children (ALSPAC) (1991)

The Avon Longitudinal Study of Parents and Children (ALSPAC) was launched in 1991 to help understand the genetic and environmental factors involved in the development of particular diseases. Researchers have been following 14,500 children born in the early 1990s and their parents. The long-term, large-scale population study will monitor the health and lifestyle of the children, and that of their offspring, until they reach 70 years old.

Dunedin Multidisciplinary Health & Development Study (1972-1973)

The Dunedin Longitudinal Study is a long-running cohort study of 1037 people born over the course of a year in Dunedin, New Zealand. The original pool of study members was selected from those born between 1 April 1972 and 31 March 1973 and still living in the Otago region 3 years later. Study members were assessed at age 3, and then at ages 5, 7, 9, 11, 13, 15, 18, 21, 26, 32 and, most recently, at age 38 (2010-2012). Future assessments are scheduled for ages 44 and 50.

Generation Scotland (GS)

GS has created an ethically sound research resource to support medical research and identify the genetic basis of common complex diseases. It incorporates three Research Tissue Banks: the Scottish Family Health Study (GS:SFHS), Genetic Health in the 21st Century (GS:21CGH) and the Donor DNA Databank (GS:3D). Together these projects have recruited a cohort of over 30,000 people. 

Genomics England

Genomics England, with the consent of participants and the support of the public, is sequencing 100,000 genomes of cancer and rare disease patients and their families.

Lothian Birth Cohorts (1921 & 1936)

The Lothian Birth Cohorts of 1921 and 1936 are follow-up studies of the Scottish Mental Surveys of 1932 and 1947. The surveys had, respectively, tested the intelligence of almost every child born in 1921 or 1936 and attending school in Scotland in the month of June in those years. Therefore, tracing, recruiting and re-testing people who had taken part in the Surveys offered a rare opportunity to examine the distribution and causes of cognitive ageing across most of the human life course. 

Million Veteran Program (MVP)

MVP is a national, voluntary research program to study how genes affect health. MVP will build one of the world's largest medical databases by safely collecting blood samples and health information from one million Veteran volunteers. 

MRC Cohort Directory

The site includes all major general population cohorts with over 1,000 participants in the UK and has a free-text and filter search function (including one for mental health). There are currently 43 studies listed on the site, and 34 of these report collecting mental health measures.

Neuroscience in Psychiatry Network (NSPN)

NSPN is a new venture from the University of Cambridge and University College London, which launched in November 2014. The NSPN is researching how the adolescent mind and brain develop into early adulthood. Over 2,000 young people have been recruited into the study, across Cambridge and London.

Twins Early Development Study (TEDS) (1994-1996)

The Twins Early Development Study (TEDS) is a large-scale longitudinal study of twins (born between 1994 and 1996) from early childhood through adolescence. The twins were assessed longitudinally at 2, 3, 4, 7, 9, 10, 12, 14 and 16 years of age, and are currently being assessed at 18, in order to investigate genetic and environmental contributions to change and continuity in language, cognitive and academic abilities and behaviour problems from multivariate quantitative and molecular genetic perspectives.

TwinsUK

TwinsUK is based at the Department of Twin Research, King’s College London. It currently comprises a total of 12,000 identical and non-identical twins from right across the UK, with ages between sixteen and ninety-eight. Female twins predominate, and overall the mean age is in the mid-fifties. It is now the UK’s only adult twin registry and the most clinically detailed in the world. The breadth of research the cohort has supported has expanded over the years to cover the genetics of a wide range of common complex traits, and the TwinsUK cohort is now probably the most genotyped and phenotyped in the world.

Philadelphia Neurodevelopmental Cohort (PNC)

The Philadelphia Neurodevelopmental research initiative focuses on characterizing brain and behaviour interaction with genetics. The PNC includes a population-based sample of over 9,500 individuals from the greater Philadelphia area, aged 8-21 years, who received medical care within the Children's Hospital of Philadelphia (CHOP) network.

UK Biobank

UK Biobank is a major national health resource, and a registered charity in its own right, with the aim of improving the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses – including cancer, heart diseases, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. Between 2006 and 2010, UK Biobank recruited 500,000 people aged 40-69 years from across the country to take part in this project.

 

Routinely collected data

Child Outcomes Research Consortium (CORC)

CORC started in 2002 as a joint initiative between five founding services (Bedfordshire & Luton; Leeds; Enfield, Barnet & Haringey; Tavistock & Portman; and Hertfordshire), and was, from the outset, a collaboration between front-line clinicians, managers and administrative leads. It collects outcome data for research and quality improvement.

Clinical Record Interactive Search (CRIS)

An initiative of the South London and Maudsley NHS Trust, the Clinical Record Interactive Search (CRIS) database now has over 200,000 fully electronic, detailed and anonymised mental health records, making it the most in-depth mental health data resource in Europe, and possibly the world.

Clinical Practice Research Datalink (CPRD)

CPRD provides anonymised primary care records for health research. It is a governmental, not-for-profit research service, jointly funded by the NHS National Institute for Health Research (NIHR) and the Medicines and Healthcare products Regulatory Agency (MHRA).

Secure Anonymised Information Linkage Databank

SAIL is an initiative developed by Swansea University Medical School, and it receives core funding from Health and Care Research Wales of the Welsh Government. The main aim of SAIL is to realise the potential of electronically held, person-based, routinely collected, de-identified information. SAIL works collaboratively with researchers, service professionals and industry to conduct and support research and to improve service delivery.

The Health Improvement Network Database (THIN Database)

THIN Database provides access to anonymised primary care records for research. It covers 6.2% of the UK population.

UK Data Service

The UK Data Service is a comprehensive resource funded by the ESRC to support researchers, teachers and policymakers who depend on high-quality social and economic data. It is a single point of access to a wide range of secondary data including large-scale government surveys, international macrodata, business microdata, qualitative studies and census data from 1971 to 2011. All are backed with extensive support, training and guidance to meet the needs of data users, owners and creators. The UK Data Service promotes data sharing to encourage the reuse of data, and provides expertise in developing best practice for research data management.

Tissue banks and brain imaging data

Tissue banks

Autism BrainNet 

Autism BrainNet, launched in 2014 in collaboration with the science and advocacy organization Autism Speaks and the Autism Science Foundation, aims to provide scientists with well-characterized, high-quality brain tissue for study.

European Bank for Induced Pluripotent Stem Cells

EBiSC is designed to address the increasing demand by iPSC researchers for quality-controlled, disease-relevant, research-grade iPSC lines, data and cell services. Its goal is to demonstrate an operational banking and distribution service of iPSC lines after 3 years and to establish for Europe a centralised, not-for-profit bank providing all qualified users with access to scalable, cost-efficient and customised products.

MRC UK Brain Banks Network 

The UK Brain Banks Network has catalogued tissue from over 10,000 brains, from all of the ten UK brain banks in one central database.

NIH NeuroBioBank

NIH NeuroBioBank facilitates research advancement through the collection and distribution of human post-mortem brain tissue. The NBB coordinates the collection, evaluation, processing, storage and distribution of nervous system tissue and associated clinical data via a network of brain and tissue repositories that span the United States for use by the broader research community for the study of neurological, psychiatric and developmental disorders.

Simons Foundation Autism Research Initiative (SFARI)

SFARI Base is a central database of clinical and genetic information about families affected by autism and other neurodevelopmental disorders, provided as part of the Simons Foundation Autism Research Initiative (SFARI). The database contains phenotype data from the Simons Simplex Collection and the Simons Variation in Individuals Project, and some individuals within the collections provide biospecimens, imaging data and the opportunity to contact them for additional research.

UKCRC Tissue Directory

The UKCRC (UK Clinical Research Collaboration) Tissue Directory is available for medical researchers wanting to source human tissue for their studies. The Directory is intended to be a first port of call for researchers needing human samples; it can be searched for both collections and capabilities.

Brain imaging data

Brain Genomics Superstruct Project (GSP)

The Brain Genomics Superstruct Project provides a carefully vetted collection of neuroimaging, behavioural, cognitive and personality data for over 1,500 human participants. Each neuroimaging data set includes one high-resolution Magnetic Resonance Imaging (MRI) acquisition and one or more resting-state functional MRI acquisitions. Each functional acquisition is accompanied by a fully automated quality assessment, and pre-computed brain morphometrics are also provided.

Enigma Consortium

Enigma is a global alliance of over 500 scientists collectively analysing, pooling and comparing brain imaging, clinical, and genetic data. The consortium currently has working groups studying 12 major brain diseases including schizophrenia and depression.

IMAGEN Consortium

IMAGEN is a European research project which aims to identify and learn more about biological and environmental factors that might have an influence on mental health in teenagers. Genome-wide and MRI data have been collected, along with behavioural and clinical data, from 2,000 fourteen-year-olds, and follow-up is currently underway.

Consent to contact for studies and other resources

Consent to contact for studies

Clinical Record Interactive Search (CRIS)

An initiative of the South London and Maudsley NHS Trust, the Clinical Record Interactive Search (CRIS) database now has over 200,000 fully electronic, detailed and anonymised mental health records, making it the most in-depth mental health data resource in Europe, and possibly the world.

NeuroScience in Psychiatry Network

NSPN is a new venture from the University of Cambridge and University College London, which launched in November 2014. The NSPN is researching how the adolescent mind and brain develop into early adulthood. Over 2,000 young people have been recruited into the study, across Cambridge and London.

NIHR BioResource Centre

Each BioResource has established groups of healthy volunteers and patients who have provided samples (of blood or saliva) and agreed to be recalled by genotype and phenotype to participate in medical research and trials.        

National Centre for Mental Health (NCMH)

NCMH brings together world-leading researchers from Cardiff, Swansea and Bangor Universities to learn more about the triggers and causes of mental health problems. They are collecting information and biological samples from thousands of volunteers from across Wales to enable research that will help improve understanding and treatment of mental illness.

Simons Foundation Autism Research Initiative (SFARI)

SFARI Base is a central database of clinical and genetic information about families affected by autism and other neurodevelopmental disorders, provided as part of the Simons Foundation Autism Research Initiative (SFARI). The database contains phenotype data from the Simons Simplex Collection and the Simons Variation in Individuals Project, and some individuals within the collections provide biospecimens, imaging data and the opportunity to contact them for additional research.

UK Clinical Trials Gateway

The UK Clinical Trials Gateway is run by the National Institute for Health Research. Members of the general public can register their consent to be contacted for clinical trials. Researchers can open a research account with the UK Clinical Trials Gateway to access the volunteer database.

Other resources

IoPPN Psychometrics and Measurement Lab

The IoPPN Psychometrics and Measurement Lab is based at the Biostatistics and Health Informatics Department at the Institute of Psychiatry, Psychology & Neuroscience (King’s College London). The goal of the Lab is to make research in the field of psychometrics readily accessible through a database in which one can search for scales, inventories and measures that were developed, refined or translated specifically at the IoPPN.

© MQ: Transforming mental health 2016 | Registered charity in England / Wales: 1139916 & Scotland: SC046075 | Company number: 7406055