Release Notes for Nicotine Dependence Dist. 11.0

Introduction

NIDA Nicotine Distribution 11.0 consists of the Version 10.0 dataset appended with n=873 additional records from Study 35 (Chen), resulting in 15593 total records in 9029 families. These totals include dummy pedigree connector records and count each singleton (n=6929) as a family. N=13233 subjects have cell lines at IBX/RUCDR.

NIDA Nicotine Distribution 10.0 consists of the Version 9.3 dataset appended with n=1792 additional records from the Study 27 (Bierut) AAND study, resulting in 14720 total records in 8156 families. These totals include dummy pedigree connector records and count each singleton (n=6056) as a family. N=12360 subjects have cell lines.

NIDA Nicotine Distribution 9.0 consists of the Version 8.0 dataset appended with n=124 additional records from the Study 10 (Swan) twin study, resulting in 12928 total records in 6364 families. These totals include dummy pedigree connector records and count each singleton (n=4264) as a family. N=10565 subjects have cell lines.

NIDA Nicotine Distribution 9.1 conforms the columns of the main distribution file to a common metadata format; content remains the same.

NIDA Nicotine Distribution 9.2 added plasma_id; no other data values changed.

NIDA Nicotine Distribution 9.3 is an update that standardizes the order of columns in the distribution file; cell_id and DNG were also updated.
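
The record, family, and singleton totals quoted above can be cross-checked against the stated increments. The following is a minimal sketch in Python, not part of the distribution packet. It assumes that the 9.1, 9.2, and 9.3 updates left the 9.0 record counts unchanged (the notes say only that content or data values were unchanged), and the names releases and count_deltas are illustrative.

```python
# Totals copied from the release descriptions above.
# Assumption: 9.1-9.3 did not change the 9.0 record counts.
releases = {
    "9.0": {"records": 12928, "families": 6364, "singletons": 4264},
    "10.0": {"records": 14720, "families": 8156, "singletons": 6056},
    "11.0": {"records": 15593, "families": 9029, "singletons": 6929},
}


def count_deltas(previous, current):
    """Return the change in each count between two releases."""
    return {key: current[key] - previous[key] for key in previous}


# Expected increments: n=1792 (Study 27) and n=873 (Study 35).
delta_10 = count_deltas(releases["9.0"], releases["10.0"])
delta_11 = count_deltas(releases["10.0"], releases["11.0"])

print("9.0 -> 10.0:", delta_10)
print("10.0 -> 11.0:", delta_11)
```

Under this assumption, all three counts rise by the same amount at each step, which is consistent with every appended Study 27 and Study 35 record being counted as its own singleton family.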

Notes

  1. NICSNP.  Two cell lines did not grow, which left n=1927 subjects for Distribution 1.0.  These records were removed from the distribution file but are available in the NICSNP genotype data. 
  2. NICSNP.  Case status is defined as Fagerstrom >= 4.
  3. NICSNP.  A dummy family id was created as sequential integers for NICSNP case-control subjects.  Subjects in both NICSNP and Study 6 were assigned their Study 6 family id. 
  4. NICSNP.  Two Study 6 individual ids in the Rutgers dataset were corrected to conform to the leading-zero format of the other Study 6 individual ids.  The corrections were:
    1. 60-82750-002 → 60-82750-02
    2. 60-5708-001 → 60-05708-01
  5. NICSNP.  Dataset includes NICSNP control subjects.  Of the n=879 control subjects, n=818 are in families with no nicotine cases. 
  6. Study 6.  If Aboriginal ancestry was reported by any family member, the family race was set to “Other”; otherwise, the family race was set to “White”. 
  7. Study 6.  Subject 60-11344-51 was genotyped as XXY.
  8. Study 6.  Genotype data indicated a father↔mother swap between father 60-29989-03 and mother 60-29989-04; Rutgers confirmed the swap occurred in the cell lines as well and corrected the database. 
  9. Study 6.  Other drug dependence defined as one or more of the following:  hallucinogen, PCP, solvent, or inhalant dependence. 
  10. Study 6.  Other drug abuse defined as one or more of the following:  hallucinogen, PCP, solvent, or inhalant abuse.  
  11. Study 6.  The dxsys variable refers to the DSM system for nicotine dependence.  All other dx variables are DSM4.  If the dx for nicotine dependence was missing, and one or more other dx was not missing, dxsys was set to DSM4.
  12. Study 6.  Variable labels are missing for 139 of the 239 study-specific interview variables.  These variables may be matched to the instrument by variable name. 
  13. Study 6.  Forty-seven variables in the interview dataset have nine SAS formats assigned.  The SAS proc format statement that defines the formats is provided with the Study 6 interview dataset and must be run before processing the interview dataset.  Otherwise, use the SAS nofmterr option. 
  14. Pedigree drawings (Study 6).  Shading of pedigree symbols.  Full shading = DSM nicotine dependence or FTND>=4.  ¾ shading = DSM alcohol dependence.   ½ shading = DSM cannabis dependence.  ¼ shading = DSM opioid, cocaine, stimulant, sedative or other drug dependence.  NICSNP data does not have DSM nicotine dependence.  Study 6 data does not have FTND. 
  15. Study 2.  No summary DSM diagnoses are provided by the study, although raw items may be used to derive diagnoses.  Preliminary analysis by the study showed <10% alcohol dependent.
  16. Study 2.  Nuclear families were combined into extended families and assigned a new fam_id with an “E” in the fourth position of the fam_id.  “E” family ids were also assigned to singletons with dummy parent records added to create a trio.  The original nuclear family id (familyid) is retained in the study-specific dataset nida_study2_participants.sas7bdat.  Dummy parent ids created by the NIDA repository have a “D” in position 4 of the ind_id. 
  17. Study 2.  All families were included in the distribution, even those with no members having FTND>=4; however, singletons with missing FTND were removed.
  18. Study 9: stimulant dependence defined as amphetamine dependence. 
  19. Study 9: stimulant abuse defined as amphetamine abuse. 
  20. Study 9.  Other drug dependence defined as one or more of the following:  hallucinogen, PCP, inhalant, or other dependence. 
  21. Study 9.  Other drug abuse defined as one or more of the following:  hallucinogen, PCP, inhalant, or other abuse. 
  22. Study 9.  Substance abuse raw dx data of 3 = “meets criteria with exclusions” was set to 1 = “none” in the distribution file; raw data was retained in the study-specific data.
  23. Study 9: Distribution 4.0 has Study 9 Site 25 data, which consists of cases and controls (Site 27 is a family study for future release).  Cases as defined by the study are included in the distribution file; control subjects were output to the NIDA Nicotine control dataset.  Study 9 defined cases as “smoked at least 5 cigarettes/day of ≥ 0.5mg nicotine for at least 5 years and has been smoking at the current rate for at least the past 6 months”.  Controls were defined as “smoked between 1 and 100 cigarettes in their lifetime without a pattern of regular smoking”.  (See also Sherva et al., Association of a single nucleotide polymorphism in neuronal acetylcholine receptor subunit alpha 5 (CHRNA5) with smoking status and with ‘pleasurable buzz’ during early experimentation with smoking.  Addiction 2008; 103: 1544–1552.)
  24. Study 9.  SAS formats are saved in a cport file instead of an operating-system-specific SAS catalog; they may be imported with the proc cimport code provided in the packet.
  25. Study 15.  The following subjects had multiple blood samples.  Because the cell id of the earlier sample corresponds to the genotyped blood sample, the cell_id of the earlier sample, rather than the more recent cell_id, was included in the distribution dataset.
    DATE RECEIVED   CELL_LINE
    08OCT2003       03NA11299*
    20APR2005       05NA18587
    18NOV2003       03NA11949*
    14FEB2006       06NA23389
    * use earlier sample because genotyped
  26. Study 15.  Other drug dependence was defined as “DSM4 dependence on drugs other than marijuana, cocaine or opiates”.
  27. Study 15 / Distribution 6.0.  The ind_id (subject identifier) was set equal to the site number concatenated with the cell line id because the study's own ids had errors that could not be resolved due to lack of documentation at the site.  Dummy parent records were created for families with more than one sib; no dummy parents were created for singleton sibs.  Dummy parent ids are site+family+relcode, where relcode = 002 for fathers and 003 for mothers.  Note that the Study 15 ind_id in Dist 5.0 therefore differs from that in Dist 6.0, which means the two distributions will not merge on ind_id.
  28. Study 15.  Study 15 is a case-control study with additional family members; case-control status was defined for probands only.  The study used several definitions of case-control status based on FTND>=4, but neither the definitions (as variables in a dataset) nor the exact algorithms for deriving them were submitted to the NIDA Repository.  The constituent variables are in the submitted data, so case status might be reconstructed if the exact algorithm were known.  N=19 probands are missing the FTND total score.  If the study sends additional data or algorithms, they will be added in a future distribution.
  29. Distribution 6.0.  Added control subjects from Studies 6, 9 (site 25), and 15.  Future distributions will include all families with at least one family member with both clinical data and a blood sample.
  30. Study 9 (site 27): n=21 dummy parent records and ids were created to fill out trio pedigrees when a parent record was missing, using the convention for the ind_id of “27-Dxxxx”, where xxxx is a unique integer.
  31. Study 9 (site 27): stimulant dependence defined as amphetamine dependence.
  32. Study 9 (site 27): stimulant abuse defined as amphetamine abuse.
  33. Study 9 (site 27).  Other drug dependence defined as one or more of the following:  hallucinogen, PCP, inhalant, or other dependence.
  34. Study 9 (site 27).  Other drug abuse defined as one or more of the following:  hallucinogen, PCP, inhalant, or other abuse.
  35. Study 9 (site 25, case-control): Other dependence and abuse (othdep, othabuse) were corrected so that 1=None and 5=Affected; previous distributions coded 1=Affected, Missing=None.
  36. Study 9 (site 27): probands are ever-smokers; therefore a trio may have no members with DSM nicotine dependence.
  37. Study 10.  Dummy parent ids and records were created for each twin pair by appending 998 (father) and 999 (mother) to the family id provided by the study, e.g., 35-1234-998 and 35-1234-999 for family 35-1234.  Note that twin ids are four-digit sequential integers from 0003 through 0522, formatted with leading zeros, and do not contain the family id; e.g., the twin ids for a family id of 35-1234 look like 35-0052 and 35-0053, while the parent ids look like 35-1234-998 and 35-1234-999.
  38. Study 10: stimulant dependence defined as amphetamine dependence.
  39. Study 10: stimulant abuse defined as amphetamine abuse.
  40. Study 10: other drug abuse defined as Hallucinogen Abuse or Other Abuse.
  41. Study 10: a family was included if any subject had both substance dependence or abuse and a cell line.
  42. Study 27: the following samples were excluded from distribution due to low volume:
    NA0047149 - disposed, low volume on all tubes
    NA0048342 - disposed, low volume on all tubes
    NA0050681 - disposed, low volume on all tubes
    NA0052279 - only one ACD tube sent; very low volume, disposed
    NA0052505 - disposed, low volume on all tubes
  43. Study 27: n=15 Case subjects have FTND 1-3 or missing FTND.
  44. Study 35: family_race derived variable: study-defined Native Hawaiian/Other Pacific Islander, Other, and multi-race were categorized as “Other”.
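
Two of the id conventions in the notes above are mechanical enough to illustrate in code: the leading-zero correction of Study 6 ids (NICSNP note) and the Study 10 dummy parent ids. The following Python sketch is illustrative only and is not the code used by the repository. The function names are hypothetical, and the Study 6 pattern (two-digit site, five-digit family, two-digit individual) is an assumption inferred from the two corrected ids listed in the notes.

```python
def normalize_study6_id(raw_id):
    """Zero-pad a Study 6 id to the assumed site-family-individual pattern.

    Assumption: family is padded to 5 digits and individual to 2 digits,
    as inferred from the two corrections listed in the notes.
    """
    site, family, individual = raw_id.split("-")
    return f"{site}-{int(family):05d}-{int(individual):02d}"


def study10_dummy_parent_ids(family_id):
    """Return (father_id, mother_id) by appending 998 and 999 to the family id."""
    return f"{family_id}-998", f"{family_id}-999"


# The two corrections documented in the notes.
print(normalize_study6_id("60-82750-002"))
print(normalize_study6_id("60-5708-001"))

# The Study 10 example family from the notes.
print(study10_dummy_parent_ids("35-1234"))
```

Ids that already conform pass through unchanged. Note that Study 10 twin ids do not contain the family id, so the dummy parent ids cannot be derived from a twin id alone.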

Table 1: Individual study case-control status for Nicotine Distribution 11.0

Study            Design                 Case-control status
2                Family                 NA
NICSNP (6, 15)   Case-Control           case: FTND>=4
6                Family                 NA
9 (site 25)      Case-Control           case: 5 cigs/day for 5 years and current 6 months
9 (site 27)      Trio                   ever-smoking proband
10               Twin                   DSM-IV dependence or abuse
15               Case-Control / Family  unknown
16               Case-Control           case: FTND 0-4, 5, 6-10
27               Case-Control           Case: FTND>=4; Control<4 (note: n=15 Case subjects have FTND 1-3 or missing FTND)
35               Clinical Trial         Active smoking (Cigarettes Per Day [CPD] ≥5), and exhaled Carbon Monoxide [CO] ≥8 ppm (inclusion criteria for all subjects; all subjects are cases)
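
For the studies in Table 1 that define cases by an FTND threshold (NICSNP and Study 27), the rule can be sketched as follows. This is a minimal Python illustration, not the algorithm used by any study. The function name ftnd_case_status is hypothetical, and treating a missing score as "no status" is an assumption. The 0-10 range check reflects the FTND total score range.

```python
FTND_CASE_THRESHOLD = 4  # "case: FTND>=4" as listed in Table 1


def ftnd_case_status(ftnd_total):
    """Classify a subject from the FTND total score.

    Returns "case" for FTND >= 4, "control" for FTND < 4, and None when
    the score is missing (assumption: missing scores are left unclassified).
    """
    if ftnd_total is None:
        return None
    if not 0 <= ftnd_total <= 10:
        raise ValueError(f"FTND total out of range: {ftnd_total}")
    return "case" if ftnd_total >= FTND_CASE_THRESHOLD else "control"


for score in (None, 3, 4, 10):
    print(score, ftnd_case_status(score))
```

A threshold rule of this kind will not reproduce the study-assigned status for every subject. For example, the n=15 Study 27 Case subjects with FTND 1-3 or missing FTND would not be classified as cases by it, so the study-provided status should be preferred where it exists.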