KTL Sarcoma study: Difference between revisions

From Opasnet
(→‎Data management: all data management code copied here)
 
(25 intermediate revisions by the same user not shown)
Line 2: Line 2:
[[Category:Finland]]
[[Category:Finland]]
[[Category:Dioxins]]
[[Category:Dioxins]]
[[Category:TCDD project]]
[[Category:Contains R code]]
[[Category:Contains R code]]
[[Category:Code under inspection]]
[[Category:Code under inspection]]
Line 61: Line 62:
==Rationale==
==Rationale==


===Study population===
=== Methods ===
====Study population====


The majority of sarcoma patients in southern Finland are treated
The majority of sarcoma patients in southern Finland are treated
Line 127: Line 129:
age and area could be found.
age and area could be found.


===Exposure assessment===
====Exposure assessment====


From the matched 337 patients, concentrations of the 17 toxic
From the matched 337 patients, concentrations of the 17 toxic
Line 208: Line 210:
The laboratory has successfully participated in several international quality control studies for the analysis of PCDD/Fs, and PCBs. Matrices in these studies have included cow milk, human milk and human serum. (Yrjänheikki, 1991, Rymen, 1994, WHO, 1996 and Lindström et al., 2000). The laboratory of chemistry in the National Public Health Institute is an accredited testing laboratory (No T077) in Finland (EN ISO/IEC 17025). The scope of accreditation includes PCDD/Fs, non-ortho PCBs, and other PCBs from human tissue samples.
The laboratory has successfully participated in several international quality control studies for the analysis of PCDD/Fs, and PCBs. Matrices in these studies have included cow milk, human milk and human serum. (Yrjänheikki, 1991, Rymen, 1994, WHO, 1996 and Lindström et al., 2000). The laboratory of chemistry in the National Public Health Institute is an accredited testing laboratory (No T077) in Finland (EN ISO/IEC 17025). The scope of accreditation includes PCDD/Fs, non-ortho PCBs, and other PCBs from human tissue samples.


===Statistical analyses===
====Statistical analyses====


Conditional logistic regression analysis was performed with
Conditional logistic regression analysis was performed with
Line 245: Line 247:
preservatives, strong detergents, heavy metals, other chemicals.
preservatives, strong detergents, heavy metals, other chemicals.


===Simulated data===
===Data===
 
; This code was used to create a csv file that contains simulated data from this study. Compared with the original data, the simulated data
* has the same number of observations,
* has the same range of values in each variable,
* has approximately the same correlation structure between all variables.
 
<rcode>
library(OpasnetUtils)
library(MASS)
library(mc2d)
library(reshape2)
library(ggplot2)
 
objects.get("isqT7nvhd0ViUR7d")
 
data <- objects.decode(etable, password)
colnames(data) <- t(data[1, ])
data <- data[2:nrow(data), 2:ncol(data)]
 
data2 <- data
fun <- c(rep("normal", 5), rep("poisson", 12), rep("lognormal", 19))
 
params <- list()
 
for(i in 1:ncol(data2)) {
data2[[i]] <- as.numeric(as.character(data2[[i]]))
if(i > 17) data2[[i]] <- ifelse(data2[[i]] == 0, 0.01, data2[[i]])
params[[i]] <- fitdistr(data2[[i]][!is.na(data2[[i]])], fun[i])$estimate # Keep only the fitted parameter estimates
}
 
simu <- data.frame(temp = rep(NA, 968))
 
for(i in 1:5) {
simu[[i]] <- rnorm(968, params[[i]][1], params[[i]][2])
}
for(i in 6:17) {
simu[[i]] <- rpois(968, params[[i]])
}
for(i in 18:36) {
simu[[i]] <- rlnorm(968, params[[i]][1], params[[i]][2])
}
simu[[3]] <- rbern(968, 0.5) + 1
 
colnames(simu) <- colnames(data)
 
korre <- cor(x = data2, use = "pairwise.complete.obs", method = "spearman")
 
simu <- as.data.frame(cornode(as.matrix(simu), target = korre))
 
korre2 <- cor(x = simu, use = "pairwise.complete.obs", method = "spearman")
 
qplot(melt(korre)$value, melt(korre2)$value)
 
for(i in 1:ncol(simu)) {
simu[[i]] <- ifelse(
simu[[i]] > max(data[[i]], na.rm = TRUE) |
simu[[i]] < min(data[[i]], na.rm = TRUE),
NA, simu[[i]]
)
}
 
for(i in 1:ncol(data2)) {print(paste(
min(data2[[i]], na.rm = TRUE),
max(data2[[i]], na.rm = TRUE),
min(simu[[i]], na.rm = TRUE),
max(simu[[i]], na.rm = TRUE)
))}
 
</rcode>
 
===Other data===


* [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R16_sarkooma/Analyysit/Analyysi020712/Analyysi020712.xls The original data], [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R80_Sarkooma2/Analyysi020712b.csv in csv file]
* [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R16_sarkooma/Analyysit/Analyysi020712/Analyysi020712.xls The original data], [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R80_Sarkooma2/Analyysi020712b.csv in csv file]
Line 355: Line 286:
</rcode>
</rcode>


===Variable information===
==== Questionnaire ====
 
* {{#l:KTL_sarcoma_questionnaire_finnish.odt}}
* {{#l:KTL_sarcoma_questionnaire_swedish.odt}}
 
====Variable information====


The variable information was originally documented in [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R16_sarkooma/Analyysit/Analyysi020712/TuomistoAnalyysiloki20020712.txt Log file about the statistical analyses: Part 1], but unfortunately mostly in Finnish.
The variable information was originally documented in [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R16_sarkooma/Analyysit/Analyysi020712/TuomistoAnalyysiloki20020712.txt Log file about the statistical analyses: Part 1], but unfortunately mostly in Finnish.
Line 530: Line 466:
}}
}}


===Data management===
====Data management====


'''Code to manage the data. It takes the original data files and merges them. Works only if files are available.
'''Code to manage the data. It takes the original data files and merges them. Works only if files are available.
Line 796: Line 732:
TRUE
TRUE
),
),
# Mitä.piimää is missing from here. Or is it? Has it been merged with milk?
Mitä.leivälle = list(c(
Mitä.leivälle = list(c(
"En mitään",
"En mitään",
Line 999: Line 936:


ffq <- opbase.data("Op_en2721", subset = "Portions per month")
ffq <- opbase.data("Op_en2721", subset = "Portions per month")
ffq$Result <- ffq$Result / 30
ffq$Result <- ffq$Result
ffq$Obs <- NULL
ffq$Obs <- NULL


Line 1,038: Line 975:
}}
}}


=== Intake data ===
==== Interpretations ====
 
The consumption of hard fat is calculated in the following way. Q## means the value from the survey question, and I## means the interpretation from the table below. Q24&I24 means that the answer to question Q24 is quantified by using the interpretation I24 with the matching value; Q23*I23 means that the survey value and the interpretation are multiplied.
 
total_fat = (Q23a*I23 + Q23b*I23) * Q24&I24 + Q25&I25 * Q26&I26 * Q21a&I21 + Q27&I27 * 20
 
The code assumes that a person uses 20 g/d of fat for cooking. The questions are Q21a: how much bread; Q23: how much a) milk, b) sour milk; Q24: what kind of milk; Q25: what fat on bread; Q26: how much fat on bread; Q27: what fat for cooking.
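
As a minimal sketch (not the study's actual code), the formula can be written as an R function. The function name <code>hard_fat</code> and its argument names are hypothetical. The lookup values are copied from the interpretation table (I23 to I27), and the number of bread slices per day is passed in directly as an assumed input rather than derived from Q21a.

```r
# Hypothetical helper: hard fat intake (g/d) from questionnaire answers.
# Lookup values are copied from the assumption table; all names are illustrative.
hard_fat <- function(q23a, q23b, q24, q25, q26, q27,
                     bread_slices, cooking_fat = 20) {
  i23 <- 2                                      # dl per glass of milk or sour milk
  i24 <- c(0.035, 0.015, 0.01, 0, 0, 0.01, 0)   # milk fat, as given in the table
  i25 <- c(0, 0.15, 0.5, 1)                     # hard fat proportion of fat on bread
  i26 <- c(0, 3, 7, 15)                         # g fat per slice of bread
  i27 <- c(0, 0.15, 0.5, 0.5, 1, 0)             # hard fat fraction of cooking fat
  milk  <- (q23a * i23 + q23b * i23) * i24[q24]
  bread <- i25[q25] * i26[q26] * bread_slices
  cook  <- i27[q27] * cooking_fat
  milk + bread + cook
}

# Two glasses of full milk, butter (3 g per slice) on four slices, butter for cooking
example <- hard_fat(q23a = 2, q23b = 0, q24 = 1, q25 = 4, q26 = 2, q27 = 5,
                    bread_slices = 4)
```

Keeping the interpretations as plain lookup vectors indexed by the answer code mirrors the Q&I notation of the formula, so each term can be checked against the table one row at a time.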
 
The following assumptions are used to interpret survey answers:
 
<t2b name="Assumptions for calculations" index="Variable,Value,Unit" obs="Result" desc="Description,Vastaus suomeksi" unit="-">
Q23||dl per glass|2|Size of a glass of milk or sourmilk|
Q24|1|fat g/dl|0.035|full milk, fat g/dl|täysmaitoa
Q24|2|fat g/dl|0.015|light milk, fat g/dl|kevytmaitoa
Q24|3|fat g/dl|0.01|1% milk, fat g/dl|ykkösmaitoa
Q24|4|fat g/dl|0|fat-free milk|rasvatonta maitoa
Q24|5|fat g/dl|0|fat-free sourmilk|rasvatonta piimää tai kirnupiimää
Q24|6|fat g/dl|0.01|other sourmilk fat g/dl|muuta piimää
Q24|7|fat g/dl|0|none of these|en juo maitoa enkä piimää
Q25|1|hard fat, proportion|0|none|en mitään
Q25|2|hard fat, proportion|0.15|soft margarine, share of hard fat|kasvimargariinia
Q25|3|hard fat, proportion|0.5|oil-butter-mix, share of hard fat|Voi-kasvirasvaseosta
Q25|4|hard fat, proportion|1|butter|voita
Q26|1|fat g /slice of bread|0|0 g per slice of bread|en lainkaan
Q26|2|fat g /slice of bread|3|3 g per slice of bread|10 g per 3 viipaletta
Q26|3|fat g /slice of bread|7|7 g per slice of bread|10 g per 1-2 viipaletta
Q26|4|fat g /slice of bread|15|15 g per slice of bread|Yli 10 g per viipale
Q27|1|hard fat fraction|0|hard fat fraction in the baking fat used|kasviöljyä
Q27|2|hard fat fraction|0.15|hard fat fraction in the baking fat used|kasvimargariinia
Q27|3|hard fat fraction|0.5|hard fat fraction in the baking fat used|talousmargariinia
Q27|4|hard fat fraction|0.5|hard fat fraction in the baking fat used|Voi-kasvirasvaseosta
Q27|5|hard fat fraction|1|hard fat fraction in the baking fat used|voita
Q27|6|hard fat fraction|0|hard fat fraction in the baking fat used|ei mitään rasvaa
Q35|1|alcohol times /a|300||päivittäin
Q35|2|alcohol times /a|100||muutaman kerran viikossa
Q35|3|alcohol times /a|50||noin kerran viikossa
Q35|4|alcohol times /a|25||pari kertaa kuukaudessa
Q35|5|alcohol times /a|12||noin kerran kuukaudessa
Q35|6|alcohol times /a|6||noin kerran parissa kuukaudessa
Q35|7|alcohol times /a|4||3-4 kertaa vuodessa
Q35|8|alcohol times /a|2||pari kertaa vuodessa
Q35|9|alcohol times /a|1||kerran vuodessa tai harvemmin
Q35|10|alcohol times /a|0||en koskaan
Q36|1|alcohol portion |0|g alcohol|vähemmän kuin yhden
Q36|2|alcohol portion |12|g alcohol|1 annoksen
Q36|3|alcohol portion |24|g alcohol|2 annosta
Q36|4|alcohol portion |36|g alcohol|3 annosta
Q36|5|alcohol portion |55|g alcohol|4-5 annosta
Q36|6|alcohol portion |96|g alcohol|6-10 annosta
Q36|7|alcohol portion |150|g alcohol|Yli 10 annosta
Q21a|1|g/day carbohydrates|1.5|carbohydrates per day of 100 g bread slices|leipää 100 g viipaleina. oletus: 50% hiilihydraattia
Q21a|2|g/day carbohydrates|2.5|carbohydrates per day of 100 g bread slices|leipää 100 g viipaleina.
Q21a|3|g/day carbohydrates|7.5|carbohydrates per day of 100 g bread slices|leipää 100 g viipaleina.
Q21a|4|g/day carbohydrates|15|carbohydrates per day of 100 g bread slices|leipää 100 g viipaleina.
Q21a|5|g/day carbohydrates|50|carbohydrates per day of 100 g bread slices|leipää 100 g viipaleina.
Q21a|6|g/day carbohydrates|100|carbohydrates per day of 100 g bread slices|leipää 100 g viipaleina.
Q21b|1|g/day carbohydrates|0.84|carbohydrates per day of 200 g porridge|puuroa 200 g annoksina. oletus: 70% hiilihydraattia viljasta, jota 20%
Q21b|2|g/day carbohydrates|1.4|carbohydrates per day of 200 g porridge|puuroa 200 g annoksina.
Q21b|3|g/day carbohydrates|4.2|carbohydrates per day of 200 g porridge|puuroa 200 g annoksina.
Q21b|4|g/day carbohydrates|8.4|carbohydrates per day of 200 g porridge|puuroa 200 g annoksina.
Q21b|5|g/day carbohydrates|28|carbohydrates per day of 200 g porridge|puuroa 200 g annoksina.
Q21b|6|g/day carbohydrates|56|carbohydrates per day of 200 g porridge|puuroa 200 g annoksina.
Q21c|1|g/day carbohydrates|1.2|carbohydrates per day of 200 g pasta|pastaa 200 g annoksina. oletus: 80% hiilihydraattia viljasta, jota 25%
Q21c|2|g/day carbohydrates|2|carbohydrates per day of 200 g pasta|pastaa 200 g annoksina.
Q21c|3|g/day carbohydrates|6|carbohydrates per day of 200 g pasta|pastaa 200 g annoksina.
Q21c|4|g/day carbohydrates|12|carbohydrates per day of 200 g pasta|pastaa 200 g annoksina.
Q21c|5|g/day carbohydrates|40|carbohydrates per day of 200 g pasta|pastaa 200 g annoksina.
Q21c|6|g/day carbohydrates|80|carbohydrates per day of 200 g pasta|pastaa 200 g annoksina.
Q21d|1|g/day carbohydrates|1.26|carbohydrates per day of 200 g muesli etc|muita (mysli ym). oletus: 70% hiilihydraattia viljasta, jota 30%
Q21d|2|g/day carbohydrates|2.1|carbohydrates per day of 200 g muesli etc|muita (mysli ym).
Q21d|3|g/day carbohydrates|6.3|carbohydrates per day of 200 g muesli etc|muita (mysli ym).
Q21d|4|g/day carbohydrates|12.6|carbohydrates per day of 200 g muesli etc|muita (mysli ym).
Q21d|5|g/day carbohydrates|42|carbohydrates per day of 200 g muesli etc|muita (mysli ym).
Q21d|6|g/day carbohydrates|84|carbohydrates per day of 200 g muesli etc|muita (mysli ym).
Q21e|1|g/day carbohydrates|0.3|carbohydrates per day of 200 g yoghurt etc|viiliä tai jugurttia, sokeri. oletus: 5% hiilihydraattia (Doc. Geigy s. 479)
Q21e|2|g/day carbohydrates|0.5|carbohydrates per day of 200 g yoghurt etc|viiliä tai jugurttia, sokeri.
Q21e|3|g/day carbohydrates|1.5|carbohydrates per day of 200 g yoghurt etc|viiliä tai jugurttia, sokeri.
Q21e|4|g/day carbohydrates|3|carbohydrates per day of 200 g yoghurt etc|viiliä tai jugurttia, sokeri.
Q21e|5|g/day carbohydrates|10|carbohydrates per day of 200 g yoghurt etc|viiliä tai jugurttia, sokeri.
Q21e|6|g/day carbohydrates|20|carbohydrates per day of 200 g yoghurt etc|viiliä tai jugurttia, sokeri.
Q21f|1|g/day carbohydrates|0.015|carbohydrates per 50 g cheese|vähärasv. juusto, sokeri.
Q21f|2|g/day carbohydrates|0.025|carbohydrates per 50 g cheese|vähärasv. juusto, sokeri. oletus: 1% hiilihydraattia (Doc. Geigy s. 479)
Q21f|3|g/day carbohydrates|0.075|carbohydrates per 50 g cheese|vähärasv. juusto, sokeri.
Q21f|4|g/day carbohydrates|0.15|carbohydrates per 50 g cheese|vähärasv. juusto, sokeri.
Q21f|5|g/day carbohydrates|0.5|carbohydrates per 50 g cheese|vähärasv. juusto, sokeri.
Q21f|6|g/day carbohydrates|1|carbohydrates per 50 g cheese|vähärasv. juusto, sokeri.
Q21g|1|g/day carbohydrates|0.015|carbohydrates per 50 g cheese|muu juusto, sokeri. oletus: 1% hiilihydraattia (Doc. Geigy s. 479)
Q21g|2|g/day carbohydrates|0.025|carbohydrates per 50 g cheese|muu juusto, sokeri.
Q21g|3|g/day carbohydrates|0.075|carbohydrates per 50 g cheese|muu juusto, sokeri.
Q21g|4|g/day carbohydrates|0.15|carbohydrates per 50 g cheese|muu juusto, sokeri.
Q21g|5|g/day carbohydrates|0.5|carbohydrates per 50 g cheese|muu juusto, sokeri.
Q21g|6|g/day carbohydrates|1|carbohydrates per 50 g cheese|muu juusto, sokeri.
Q21h|1|g/day carbohydrates|0.3|carbohydrates per 100 g ice cream|jäätelöä. oletus: 10% hiilihydraattia
Q21h|2|g/day carbohydrates|0.5|carbohydrates per 100 g ice cream|jäätelöä.
Q21h|3|g/day carbohydrates|1.5|carbohydrates per 100 g ice cream|jäätelöä.
Q21h|4|g/day carbohydrates|3|carbohydrates per 100 g ice cream|jäätelöä.
Q21h|5|g/day carbohydrates|10|carbohydrates per 100 g ice cream|jäätelöä.
Q21h|6|g/day carbohydrates|20|carbohydrates per 100 g ice cream|jäätelöä.
Q21i|1|g/day hard fat|0.12|hard fat per 200 g yoghurt etc|viiliä tai jugurttia, rasva. oletus: 2 % rasvaa
Q21i|2|g/day hard fat|0.2|hard fat per 200 g yoghurt etc|viiliä tai jugurttia, rasva.
Q21i|3|g/day hard fat|0.6|hard fat per 200 g yoghurt etc|viiliä tai jugurttia, rasva.
Q21i|4|g/day hard fat|1.2|hard fat per 200 g yoghurt etc|viiliä tai jugurttia, rasva.
Q21i|5|g/day hard fat|4|hard fat per 200 g yoghurt etc|viiliä tai jugurttia, rasva.
Q21i|6|g/day hard fat|8|hard fat per 200 g yoghurt etc|viiliä tai jugurttia, rasva.
Q21j|1|g/day hard fat|0.15|hard fat per 50 g low-fat cheese|vähärasvainen juusto. oletus: 10% rasvaa
Q21j|2|g/day hard fat|0.25|hard fat per 50 g low-fat cheese|vähärasvainen juusto.
Q21j|3|g/day hard fat|0.75|hard fat per 50 g low-fat cheese|vähärasvainen juusto.
Q21j|4|g/day hard fat|1.5|hard fat per 50 g low-fat cheese|vähärasvainen juusto.
Q21j|5|g/day hard fat|5|hard fat per 50 g low-fat cheese|vähärasvainen juusto.
Q21j|6|g/day hard fat|10|hard fat per 50 g low-fat cheese|vähärasvainen juusto.
Q21k|1|g/day hard fat|0.45|hard fat per 50 g cheese|juusto. oletus: 30% rasvaa (fineli)
Q21k|2|g/day hard fat|0.75|hard fat per 50 g cheese|juusto.
Q21k|3|g/day hard fat|2.25|hard fat per 50 g cheese|juusto.
Q21k|4|g/day hard fat|4.5|hard fat per 50 g cheese|juusto.
Q21k|5|g/day hard fat|15|hard fat per 50 g cheese|juusto.
Q21k|6|g/day hard fat|30|hard fat per 50 g cheese|juusto.
Q21l|1|g/day hard fat|0.3|hard fat per 100 g ice cream|jäätelöä. oletus: 10% rasvaa
Q21l|2|g/day hard fat|0.5|hard fat per 100 g ice cream|jäätelöä.
Q21l|3|g/day hard fat|1.5|hard fat per 100 g ice cream|jäätelöä.
Q21l|4|g/day hard fat|3|hard fat per 100 g ice cream|jäätelöä.
Q21l|5|g/day hard fat|10|hard fat per 100 g ice cream|jäätelöä.
Q21l|6|g/day hard fat|20|hard fat per 100 g ice cream|jäätelöä.
Q21m|1|g/day hard fat|0.45|hard fat per 100 g meat |liharuokaa. oletus: 15% rasvaa (Doc. Geigy s. 481)
Q21m|2|g/day hard fat|0.75|hard fat per 100 g meat |liharuokaa.
Q21m|3|g/day hard fat|2.25|hard fat per 100 g meat |liharuokaa.
Q21m|4|g/day hard fat|4.5|hard fat per 100 g meat |liharuokaa.
Q21m|5|g/day hard fat|15|hard fat per 100 g meat |liharuokaa.
Q21m|6|g/day hard fat|30|hard fat per 100 g meat |liharuokaa.
Q21n|1|g/day hard fat|0.15|hard fat per 100 g fish|kalaruokaa. oletus: 5% kovaa rasvaa, finelin mukaan 2-5%
Q21n|2|g/day hard fat|0.25|hard fat per 100 g fish|kalaruokaa.
Q21n|3|g/day hard fat|0.75|hard fat per 100 g fish|kalaruokaa.
Q21n|4|g/day hard fat|1.5|hard fat per 100 g fish|kalaruokaa.
Q21n|5|g/day hard fat|5|hard fat per 100 g fish|kalaruokaa.
Q21n|6|g/day hard fat|10|hard fat per 100 g fish|kalaruokaa.
</t2b>
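
The alcohol rows (Q35 and Q36) can be read in the same lookup style. The sketch below rests on an assumption that is not stated on this page: that annual alcohol intake is the number of drinking occasions per year multiplied by the grams of alcohol per occasion. The function name <code>alcohol_g_per_year</code> is hypothetical.

```r
# Hypothetical helper: grams of alcohol per year from answers to Q35 and Q36.
# ASSUMPTION: intake = occasions per year (Q35) * grams per occasion (Q36).
alcohol_g_per_year <- function(q35, q36) {
  times_per_year <- c(300, 100, 50, 25, 12, 6, 4, 2, 1, 0)  # Q35 options 1-10
  g_per_occasion <- c(0, 12, 24, 36, 55, 96, 150)            # Q36 options 1-7
  times_per_year[q35] * g_per_occasion[q36]
}

# About once a week (option 3), two portions per occasion (option 3)
weekly_two <- alcohol_g_per_year(q35 = 3, q36 = 3)
```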


How much mass, energy, and dioxin does one portion contain? Data are guesswork or from [http://www.fineli.fi Fineli].
How much mass, energy, and dioxin does one portion contain? Data are guesswork or from [http://www.fineli.fi Fineli].


<t2b name="Food energy and dioxin" index="Food,Observation" locations="Mass,Energy,Dioxin" unit="g,kJ,pg/portion">
<t2b name="Food energy and dioxin" index="Food,Observation" locations="Mass,Energy,Dioxin" unit="g,kJ,pg/portion">
Kalaa|100|600|7
Silakkaa|100|792|470
Petokalaa|100|301|25
Petokalaa|100|301|25
Muikkua|100|750|28
Muikkua|100|750|28
Line 1,062: Line 1,135:
Jäätelöä|150|1200|0.03
Jäätelöä|150|1200|0.03
Liharuokaa|150|1400|1.5
Liharuokaa|150|1400|1.5
Maitoa|200|358|0.004
Piimää|200|358|0.004
</t2b>
</t2b>


Line 1,074: Line 1,149:
Kerran päivässä tai useammin|40
Kerran päivässä tai useammin|40
</t2b>
</t2b>
=== Analyses ===
====Simulated data====
; This code was used to create a csv file that contains simulated data from this study. Compared with the original data, the simulated data
* has the same number of observations,
* has the same range of values in each variable,
* has approximately the same correlation structure between all variables.
<rcode>
library(OpasnetUtils)
library(MASS)
library(mc2d)
library(reshape2)
library(ggplot2)
objects.get("isqT7nvhd0ViUR7d")
data <- objects.decode(etable, password)
colnames(data) <- t(data[1, ])
data <- data[2:nrow(data), 2:ncol(data)]
data2 <- data
fun <- c(rep("normal", 5), rep("poisson", 12), rep("lognormal", 19))
params <- list()
for(i in 1:ncol(data2)) {
data2[[i]] <- as.numeric(as.character(data2[[i]]))
if(i > 17) data2[[i]] <- ifelse(data2[[i]] == 0, 0.01, data2[[i]])
params[[i]] <- fitdistr(data2[[i]][!is.na(data2[[i]])], fun[i])$estimate # Keep only the fitted parameter estimates
}
simu <- data.frame(temp = rep(NA, 968))
for(i in 1:5) {
simu[[i]] <- rnorm(968, params[[i]][1], params[[i]][2])
}
for(i in 6:17) {
simu[[i]] <- rpois(968, params[[i]])
}
for(i in 18:36) {
simu[[i]] <- rlnorm(968, params[[i]][1], params[[i]][2])
}
simu[[3]] <- rbern(968, 0.5) + 1
colnames(simu) <- colnames(data)
korre <- cor(x = data2, use = "pairwise.complete.obs", method = "spearman")
simu <- as.data.frame(cornode(as.matrix(simu), target = korre))
korre2 <- cor(x = simu, use = "pairwise.complete.obs", method = "spearman")
qplot(melt(korre)$value, melt(korre2)$value)
for(i in 1:ncol(simu)) {
simu[[i]] <- ifelse(
simu[[i]] > max(data[[i]], na.rm = TRUE) |
simu[[i]] < min(data[[i]], na.rm = TRUE),
NA, simu[[i]]
)
}
for(i in 1:ncol(data2)) {print(paste(
min(data2[[i]], na.rm = TRUE),
max(data2[[i]], na.rm = TRUE),
min(simu[[i]], na.rm = TRUE),
max(simu[[i]], na.rm = TRUE)
))}
</rcode>
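
The core idea of the code above (fit a distribution to one variable, draw the same number of values from it, and blank out draws that fall outside the observed range) can be tried without access to the protected data. The following is a minimal sketch with a made-up variable; <code>observed</code> is a hypothetical stand-in for one study column, and the correlation step with <code>cornode</code> is left out.

```r
library(MASS)

set.seed(1)
# Hypothetical stand-in for one lognormally distributed study variable
observed <- rlnorm(200, meanlog = 1, sdlog = 0.5)

# Fit a lognormal distribution and keep the parameter estimates
fit <- fitdistr(observed, "lognormal")

# Draw as many simulated values as there are observations
simulated <- rlnorm(length(observed),
                    fit$estimate["meanlog"], fit$estimate["sdlog"])

# Values outside the observed range are set to NA, as in the code above
simulated[simulated < min(observed) | simulated > max(observed)] <- NA
```

Setting out-of-range values to NA rather than redrawing them keeps the number of rows equal to the original, at the cost of some missing values in the simulated data.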
==== POPs and obesity ====
Dioxins and PCBs have been associated with type 2 diabetes. Do dioxins cause diabetes, does diabetes decrease dioxin elimination, does obesity increase diabetes and decrease dioxin elimination, or is it something else? We tried to make sense of this by looking at the sarcoma study data.
<rcode label="Code does not work without the data file" embed=1>
library(ggplot2)
dat <- read.csv("V:/TUSO/Projects/POPit ja lihavuus/Sarkoomakyselydata/Copy of sarkooma_kysely_ja_dioksiinit_korjattu.csv")
dat$Diet <- dat$Rasvaa.maitotuotteista + dat$k21liha + dat$k21kala * 8
hist(dat$Diet)
dat$Diet3 <- cut(dat$Diet, 3)
ggplot(dat, aes(x = ika, y = IntakePCDDFTEQ, colour = Diet3)) +
  geom_point() + geom_smooth()
ggplot(dat, aes(x = ika, y = PCDDFWHO05TEQ, colour = Diet3)) +
  geom_point() + geom_smooth()
</rcode>
==== Self-reported chemical exposure ====
We looked at self-reported chemical exposure, especially pesticides and wood preservatives.
We also looked at the impact of self-reported occupation, recoded into 9 groups. This is best done in the unmatched dataset, but some analyses were also done with the matched dataset. Age was the only clearly significant variable, with sarcoma risk increasing by 8 % per year. Male gender seemed to increase the risk, but the effect was not statistically significant. None of the differences between occupation groups were statistically significant, and they did not show a pattern where putatively chemically exposed groups would have a higher risk.
<rcode label="Code does not work without the data file">
#################
# Bring in the hand-made occupation classification
d <- read.csv("V:/TUSO/Projects/Sarkooma/Analyysit/Kyselykaavaka_ammatti-tyo_edit.csv")
#colnames(d)
#[1] "N"                        "ID"                      "Työntekijäryhmä"          "Luokitus..koodi.lopussa."
#[5] "Alle.5.v.työhistoria"    "Huomattavaa"              "Ammatti"                  "Työpaikka"             
#[9] "Kesto"                    "Työtehtävä"              "AmmattiA"                "TyöpaikkaA"             
#[13] "KestoA"                  "AmmattiB"                "TyöpaikkaB"              "KestoB"                 
#[17] "AmmattiC"                "TyöpaikkaC"              "KestoC"                  "AmmattiD"               
#[21] "TyöpaikkaD"              "KestoD"                 
lev <- as.character(d[974:982,4])
d <- d[1:969,c(2,4,5,6)]
d <- d[d$ID != "" , ] # Remove empty row 883
colnames(d) <- c("ID", "Tyoluokka", "Alle5v", "Huom.tyo")
d$Tyoluokka <- factor(d$Tyoluokka, levels = 1:9, labels = lev)
d$Tyoalt <- ifelse(as.numeric(d$Tyoluokka) %in% c(1,2,9), "Ei",
                  ifelse(as.numeric(d$Tyoluokka) %in% c(3,8), "Ehkä", "Kyllä"))
#> levels(d$Tyoluokka)
#[1] "Opiskelija"            "Sisätyö"                "Hoitoala"              "Maa- ja metsätalous" 
#[5] "Sotilas, palomies ym"  "Teollisuustyö"          "Rakennusala, ulkotyö"  "Kauppa, elintarvikeala"
#[9] "Työtön tai ei tietoa" 
###################################
library(lme4)
# Data from //helfs01.thl.fi/groups2/TUSO/Projects/POPit ja lihavuus/Dioksiinit vs sarkooma/Data.xlsx
dat <- read.csv("V:/TUSO/Projects/Sarkooma/Analyysit/Data_2.12.2016.csv", encoding = "UTF-8")
names(dat)
dat$PCDDFWHO05TEQ <- dat$PCDDFWHO05TEQ / 20 # Scale to a nominal interquartile range (ca. 19.5 pg/g fat, depending on subgroup)
# Pekka's model with the clogit function
library("survival")
# A conditional regression with new occupation classification. Regression method as below.
#> sum(as.character(d$ID) != as.character(dat$ID))
#[1] 0
# Because rows are identically ordered, just cbind the occupation data without redundant ID.
dat <- cbind(dat, d[-1])
dat$Tyoluokka <- relevel(dat$Tyoluokka, "Sisätyö")
dat$Alle5v <- ifelse(dat$Alle5v == "1", "Yes", "No")
table(dat[c("Tyoluokka","Sarcoma.unmatched","Alle5v")], useNA = "ifany")
clogit(Sarcoma.unmatched ~ Sex + Age + Tyoluokka + Alle5v, # + PCDDFWHO05TEQ
      #strata(Sarcoma.matched.pair),
      method="exact", data = dat
)
clogit(Sarcoma.unmatched ~ Sex + Age + Tyoluokka, # + PCDDFWHO05TEQ
      #strata(Sarcoma.matched.pair),
      method="exact", data = dat[dat$Alle5v == "No",]
)
clogit(Sarcoma.matched ~ Sex + Tyoalt + # Tyoluokka + # + PCDDFWHO05TEQ
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat
)
clogit(Sarcoma.matched ~ Tyoluokka + # + PCDDFWHO05TEQ # No sex to avoid too many subgroups
        strata(Sarcoma.matched.pair),
      method="exact", data = dat
)
# The analysis above does not give reliable results because of the warning: Loglik converged before variable 1,2,3,4,5,6,7,8
temp <- list()
for(i in levels(dat$Tyoluokka)) {
  dat$Temp <- ifelse(dat$Tyoluokka == i, TRUE, FALSE)
  print(i)
  temp2 <- clogit(Sarcoma.matched ~ Sex + Temp + # Tyoluokka + # + PCDDFWHO05TEQ
          strata(Sarcoma.matched.pair),
        method="exact", data = dat
  )
  print(summary(temp2)$conf.int)
  temp <- rbind(
    temp,
    data.frame(
      Tyoluokka = i,
      summary(temp2)$conf.int,
      Pvalue = summary(temp2)$coefficients[,5]
    )
  )
}
# The analysis above compares one group of Tyoluokka to all others in the matched data set.
# A better analysis is below with unmatched analysis.
temp
# On the other hand, questionnaire was collected from everyone, so matching can be removed (unlike with dioxins)
# without altering the design. Let's try what happens without matching.
clogit(Sarcoma.unmatched ~ Sex + Age + Tyoluokka, # + PCDDFWHO05TEQ
              #strata(Sarcoma.matched.pair),
          method="exact", data = dat
)
table(dat[c("Tyoluokka", "Sarcoma.matched")])
table(dat[c("Sarcoma.matched","Sarcoma.unmatched")], useNA = "ifany")
# Exact estimation. Produces the same result as Riikka's.
# Several different models were run. All included Sex as a confounder.
# Four pairs of models looked at each chemical risk separately (Analysis: Separate),
# and dioxin risk in the respective population.
# Four models looked at each chemical + dioxin in a combined model,
# adjusting for each other (Analysis: Combined).
# Finally, one model contained all three chemicals and dioxin in a single model,
# naturally not containing the combined chemical exposure this time.
models <- list()
models[[1]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ + Exposure.woodpreservatives +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.woodpr == 1 , ]
)
models[[2]] <- clogit(Sarcoma.matched ~ Sex + Exposure.woodpreservatives +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.woodpr == 1 , ]
)
models[[3]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.woodpr == 1 , ]
)
models[[4]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ + Exposure.fungicidesherbicides +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.funher == 1 , ]
)
models[[5]] <- clogit(Sarcoma.matched ~ Sex + Exposure.fungicidesherbicides +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.funher == 1 , ]
)
models[[6]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.funher == 1 , ]
)
models[[7]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ + Exposure.insecticides +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.insect == 1 , ]
)
models[[8]] <- clogit(Sarcoma.matched ~ Sex + Exposure.insecticides +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.insect == 1 , ]
)
models[[9]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.insect == 1 , ]
)
models[[10]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ + Exposure.any +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.any == 1 , ]
)
models[[11]] <- clogit(Sarcoma.matched ~ Sex + Exposure.any +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.any == 1 , ]
)
models[[12]] <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ +
                        strata(Sarcoma.matched.pair),
                      method="exact", data = dat[dat$Inclusion.criteria.any == 1 , ]
)
out <- data.frame()
for(i in 1:length(models)) {
  out <- rbind(
    out,
    cbind(
      as.data.frame(summary(models[[i]])$coefficients),
      as.data.frame(summary(models[[i]])$conf.int)
    )
  )
}
out$Subpop <- rep(c(
  "Wood preservatives",
  "Fungicides, herbicides",
  "Insecticides",
  "Any of above"),
  each = 7
)
print(out, digits = 3)
out <- out[c(2,3,5,7,9,10,12,14,16,17,19,21,23,24,26,28) , c(2, 8, 9, 5, 10)]
out$Analysis <- rep(c("Combined","Separate"), each = 2, times = 4)
out <- out[order(out$Subpop, rownames(out), out$Analysis) , ]
print(out, digits = 3)
#### Analysis where all chemicals are in a single model.
mo <- clogit(Sarcoma.matched ~ Sex + PCDDFWHO05TEQ + Exposure.insecticides + Exposure.fungicidesherbicides + Exposure.woodpreservatives +
              strata(Sarcoma.matched.pair),
            method="exact", data = dat#[dat$Inclusion.criteria.any == 1 , ]
)
summary(mo)
# Fungicides/herbicides clearly elevate the risk, and the effect is statistically significant.
# Wood preservatives also show a high risk, but it is only marginally significant.
# Insecticides and dioxin are not associated with higher risk.
# Chemicals are moderately correlated with each other and somewhat with dioxin,
# as shown by the correlation table below.
cor(dat[
  dat$Inclusion.criteria.any == 1 ,
  c("Exposure.any",
    "Exposure.fungicidesherbicides",
    "Exposure.insecticides",
    "Exposure.woodpreservatives",
    "PCDDFWHO05TEQ"
  )],
  use = "pairwise.complete.obs"
)
# Is Baltic herring an independent risk factor for sarcoma?
# Well, the risk is increased but clearly non-significant (OR 1.388, 95 % CI 0.8063 - 2.389)
dat$Silakka <- as.numeric(dat$Silakkaa) > 3
table(dat$Silakkaa, dat$Silakka)
fit <- clogit(Sarcoma.matched ~ Sex + Silakka +
                strata(Sarcoma.matched.pair),
              method="exact", data = dat
)
summary(fit)
# A question was raised why the PCDDFWHO05TEQ estimates for the wood-preservative group and the any-exposure group were identical.
# The reason can be seen from here:
table(dat[c(
  "Inclusion.criteria.woodpr",
  "Inclusion.criteria.any",
  "Sarcoma.matched")],
  exclude = NULL
)
# The two groups are practically identical with only two additional controls in the any-exposure group.
# These two controls do not much change the PCDDFWHO05TEQ impact on sarcoma, and therefore the estimates are the same
# with precision of three decimals.
table(dat[c(
  "Inclusion.criteria.insect",
  "Inclusion.criteria.any",
  "Sarcoma.matched")],
  exclude = NULL
)
# However, with insecticides, there are three more controls and TWO MORE CASES in the any-exposure group, and that does change the estimates.
# With herbicides and fungicides, there are one more control and two more cases, which is also enough to change the estimates.
</rcode>
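
Because the code above needs the original data file, the following minimal sketch shows the same kind of matched-pair conditional logistic regression on made-up data. The data frame <code>toy</code> and its columns are hypothetical, and the exposure is generated independently of case status, so the sketch only illustrates the mechanics of <code>clogit</code> with <code>strata()</code> and says nothing about the study results.

```r
library(survival)

set.seed(42)
n_pairs <- 100
# Hypothetical matched data: each pair holds one case and one control
toy <- data.frame(
  pair    = rep(seq_len(n_pairs), each = 2),
  case    = rep(c(1, 0), times = n_pairs),
  exposed = rbinom(2 * n_pairs, 1, 0.3)   # exposure unrelated to case status
)

# Conditional logistic regression with pairs as strata, exact likelihood
fit <- clogit(case ~ exposed + strata(pair), data = toy, method = "exact")

# Odds ratio with its 95 % confidence interval
or <- summary(fit)$conf.int
```

Only pairs in which the case and the control differ in exposure contribute to the estimate, which is why small subgroups can give very wide confidence intervals.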
==== Correlation of dioxin and fish ====
How do individual dioxin congeners correlate with individual fish parameters in the questionnaire?
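
The analysis code removes the effect of age from the concentrations before correlating them with fish consumption. A minimal sketch of that step with made-up data is shown here; all variable names and the effect sizes used to generate the data are hypothetical.

```r
set.seed(7)
n <- 200
age  <- runif(n, 20, 80)      # years
fish <- rpois(n, 2)           # fish meals per week
# Hypothetical concentration that depends on both age and fish intake
conc <- 0.5 * age + 3 * fish + rnorm(n, sd = 5)

# Raw correlation between concentration and fish intake
raw_cor <- cor(conc, fish)

# Age-adjusted correlation: correlate the residuals of conc ~ age with fish
adj_cor <- cor(resid(lm(conc ~ age)), fish)
```

Taking residuals first keeps the variation explained by age from masking the association with fish intake.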
<rcode label="Code does not work without the data file">
# This is code Op_en2721/ on page [[KTL Sarcoma study]]
library(lme4)
library(Hmisc)
library(MASS)
library(ggplot2)
library(rjags)
# Data from //helfs01.thl.fi/groups2/TUSO/Projects/POPit ja lihavuus/Dioksiinit vs sarkooma/Data.xlsx
dat <- read.csv("V:/TUSO/Projects/Sarkooma/Analyysit/Data_2.12.2016.csv", encoding = "UTF-8")
names(dat)
kalat <- colnames(dat)[c(29, 27, 79:89)] # All: 27, 29, 79:89
dioksiinit <- colnames(dat)[c(310:326, 365)] # All: 310:368
dat[1:20,kalat]
dat[1:20,dioksiinit]
colnames(dat)
unique(unlist(lapply(dat[kalat], FUN = levels)))
inn <- c(
  "",
  "En lainkaan",
  "Harvemmin kuin kerran kuukaudessa tai en lainkaan",
  "Harvemmin kuin kerran kuukaudessa",
  "Kerran tai pari kuukaudessa",
  "Kerran viikossa",
  "Pari kertaa viikossa",
  "Lähes joka päivä",
  "Kerran päivässä tai useammin"
)
# Data are from V:\TUSO\Projects\POPit ja lihavuus\Excel-mallit\Concentration modeling.xlsx
# except doses, which is a rough rule-of-thumb estimate
# Meals per week
doses <- c(NA, 0, 0.1, 0.15, 0.3, 1, 2, 5, 9)
# Congener half-life in years
t1.2 <- c(7.2, 11.2, 9.8, 13.1, 5.1, 4.9, 6.7, 2.1, 3.5,
          7.0, 6.4, 7.2, 7.2, 2.8, 3.1, 4.6, 1.4, 7)
# WHO2005 TEF
TEF <- c(1, 1, 0.1, 0.1, 0.1, 0.01, 0.0003, 0.1, 0.03,
        0.3, 0.1, 0.1, 0.1, 0.1, 0.01, 0.01, 0.0003, 1)
dat2 <- dat[c(kalat, dioksiinit, "Age")]
# Convert fish intake answers to units meals/week
dat2[kalat][-1] <- lapply(dat2[kalat][-1], FUN = function(x) doses[match(x, inn)])
# Convert dioxin concentrations to TEQs
dat2[dioksiinit] <- lapply(as.list(1:length(TEF)), FUN = function(x) TEF[x] * dat2[dioksiinit][[x]])
cor(x = dat2[dioksiinit], y = dat2[kalat], use = "pairwise.complete.obs")
dat3 <- dat2[!is.na(rowSums(dat2[c(kalat, dioksiinit)])) , ]
dat3.diox <- resid(lm(cbind(
  PCDDFWHO05TEQ,
  X2378.TCDD,
  X12378.PD,
  X123478.HD,
  X123678.HD,
  X123789.HD,
  X1234678.D,
  OCDD,
  X2378.TCDF,
  X12378.PF,
  X23478.PF,
  X123478.HF,
  X123678.HF,
  X123789.HF,
  X234678.HF,
  X1234678.F,
  X1234789.F,
  OCDF
) ~ Age, dat3))
correlations <- round(cor(dat3.diox, dat3[kalat]), 3)
pvalues <- round(rcorr(dat3.diox, as.matrix(dat3[kalat]))$P, 3)[1:18, -(1:18)]
fit <- lm(
  paste("cbind(", paste(dioksiinit, collapse = ","),
        ") ~ ", paste(c(kalat, "Age"), collapse = " + "),
        collapse = ""),
  data = dat3 
)
summary(fit)
out <- data.frame()
################# Explain each congener with all fish variables + Age
for(i in 1:length(dioksiinit)) {
  fit <- lm(paste(dioksiinit[[i]], "~", paste(kalat, collapse = " + "), "+ Age"), dat3)
  fit <- summary(stepAIC(fit, direction="both"))
 
  out <- rbind(
    out,
    data.frame(
      fit$coefficients,
      Var = rownames(fit$coefficients),
      adj.r.squared = fit$adj.r.squared,
      Congener = dioksiinit[[i]],
      Halflife = t1.2[[i]],
      Test = "With age"
    )
  )
}
############# Age is removed from the models to see the explanatory power of fish variables alone
for(i in 1:length(dioksiinit)) {
  fit <- lm(paste(dioksiinit[[i]], "~", paste(kalat, collapse = " + ")), dat3)
  fit <- summary(stepAIC(fit, direction="both"))
 
  out <- rbind(
    out,
    data.frame(
      fit$coefficients,
      Var = rownames(fit$coefficients),
      adj.r.squared = fit$adj.r.squared,
      Congener = dioksiinit[[i]],
      Halflife = t1.2[[i]],
      Test = "Without age"
    )
  )
}
################# Explain each congener with the specific fish variables only (the two generic fish variables dropped) + Age
for(i in 1:length(dioksiinit)) {
  fit <- lm(paste(dioksiinit[[i]], "~", paste(kalat[c(-1, -2)], collapse = " + "), "+ Age"), dat3)
  fit <- summary(stepAIC(fit, direction="both"))
 
  out <- rbind(
    out,
    data.frame(
      fit$coefficients,
      Var = rownames(fit$coefficients),
      adj.r.squared = fit$adj.r.squared,
      Congener = dioksiinit[[i]],
      Halflife = t1.2[[i]],
      Test = "Without generic fish variables, with age"
    )
  )
}
oprint(out)
write.csv(out, "V:/TUSO/Projects/Sarkooma/lineaariregressiot.csv")
colnames(out)
head(out)
temp <- out[out$Var != "(Intercept)", ]
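# The p-value column "Pr(>|t|)" is renamed to Pr...t.. by data.frame(); refer to it by its full name inside aes()
ggplot(temp, aes(x = Var, y = Estimate, colour = Congener, size = Pr...t.. < 0.05)) + geom_point()+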
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))+
  facet_wrap(~ Test)
################# Bayesian approach
kal <- kalat[-1]
datb <- dat[c(kal)]#, dioksiinit, "Age")]
#x <- datb[[1]]
datb[kal] <- lapply(
  datb[kal],
  FUN = function(x) {
    factor(
      x,
      levels = if(inn[3] %in% x) inn[c(3,5:9)] else inn[c(2,4:8)],
      ordered = TRUE
    )
  }
)
test <- as.data.frame(lapply(datb, FUN = is.na))
datb <- datb[rowSums(test) < 8 , ]
datb <- as.data.frame(lapply(datb, FUN = function(x) as.numeric(x) -1))
#table(datb[1:2])
datblong <- melt(datb, measure.vars = 1:12)
ggplot(datblong, aes(x = value, weight = 1))+geom_bar()+facet_wrap(~ variable)
correlations <- cor(datb, use = "pairwise.complete.obs")
melt(correlations, measure.vars = 1:12)
pvalues <- round(rcorr(as.matrix(datb))$P, 3)
mod <- textConnection("
  model{
    for(j in 1:12) {
      for(i in 1:N) {
        datbb[i , j] ~ dbin(p[j], 6) #Six alternatives in each question
      }
      p[j] ~ dunif(0,1)
    }
  }
")
# A binomial distribution is assumed for bins of answer choices.
jags <- jags.model(
  mod,
  data = list(
    N = nrow(datb), # number of respondents; length(datb) would give the number of columns
    datbb = as.matrix(datb)
  ),
  n.chains = 4,
  n.adapt = 100
)
update(jags, 1000)
samps <- jags.samples(jags, 'p', 1000)
samps.coda <- coda.samples(jags, 'p', 1000)
plot(samps.coda[[1]])
head(samps.coda)
summary(samps.coda)
hist(cor(samps.coda[[1]])[cor(samps.coda[[1]]) != 1])
### In practice, all correlations are between -0.05 and 0.05 -> meaningless
# Important correlations can be (and have been above) calculated directly from data.
probs <- colSums(samps.coda[[1]]) / nrow(samps.coda[[1]])
gramm <- data.frame(
  Fish = kal,
  P = probs,
  Answer = rep(0:6, each = length(kal)),
  Freq = dbinom(rep(0:6, each = length(kal)), 6, rep(probs, 7))
)
ggplot(gramm, aes(x = Answer, weight = Freq))+geom_bar()+facet_wrap(~ Fish)
</rcode>
These estimates are based on the code above.
<t2b name="Binomial distribution parameter" index="Fish" obs="Parameter" unit="probability">
Kalaa|0.23179078
Petokalaa|0.17746642
Muikkua|0.14939457
Sisävesikalaa|0.09493785
Kirjolohta|0.29775961
Silakkaa|0.2152247
Itämeren.lohta|0.08175282
Muuta.Itämerestä|0.0282421
Pakastekalaa|0.21510166
Kalasäilykkeitä|0.21598007
Valtamerikalaa|0.1078198
Äyriäisiä|0.16021416
</t2b>
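As a rough illustration of what these parameters mean, the sketch below turns one fitted probability into the implied distribution of answers, under the same binomial assumption (size 6) that the code above uses. The variable names are made up for this example.

```r
# Fitted binomial parameter for total fish (Kalaa), copied from the table above
p.kalaa <- 0.23179078
answers <- 0:6

# Implied share of respondents in each answer bin, assuming dbin(p, 6) as in the model
freq <- dbinom(answers, size = 6, prob = p.kalaa)
implied <- data.frame(Answer = answers, Freq = round(freq, 3))
print(implied)

# Expected answer bin under the same assumption
mean.answer <- 6 * p.kalaa
print(mean.answer)
```

A small p therefore means that most respondents fall into the lowest answer bins for that fish type.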
'''Correlation coefficients between fish dishes'''
{{hidden|
<t2b name="Correlation coefficients between fishes" index="Food1,Food2" obs="Coefficient" unit="correlation">
Kalaa|Kalaa|1
Petokalaa|Kalaa|0.37831574
Muikkua|Kalaa|0.286657251
Sisävesikalaa|Kalaa|0.365112639
Kirjolohta|Kalaa|0.530184225
Silakkaa|Kalaa|0.425923276
Itämeren.lohta|Kalaa|0.305152183
Muuta.Itämerestä|Kalaa|0.2111844
Pakastekalaa|Kalaa|0.207057983
Kalasäilykkeitä|Kalaa|0.21677219
Valtamerikalaa|Kalaa|0.273916098
Äyriäisiä|Kalaa|0.143764271
Kalaa|Petokalaa|0.37831574
Petokalaa|Petokalaa|1
Muikkua|Petokalaa|0.422402864
Sisävesikalaa|Petokalaa|0.499609578
Kirjolohta|Petokalaa|0.205863144
Silakkaa|Petokalaa|0.202035751
Itämeren.lohta|Petokalaa|0.079332036
Muuta.Itämerestä|Petokalaa|0.10229554
Pakastekalaa|Petokalaa|-0.055693699
Kalasäilykkeitä|Petokalaa|0.049204262
Valtamerikalaa|Petokalaa|0.080707577
Äyriäisiä|Petokalaa|0.082444392
Kalaa|Muikkua|0.286657251
Petokalaa|Muikkua|0.422402864
Muikkua|Muikkua|1
Sisävesikalaa|Muikkua|0.394067217
Kirjolohta|Muikkua|0.227258746
Silakkaa|Muikkua|0.24510588
Itämeren.lohta|Muikkua|0.104823821
Muuta.Itämerestä|Muikkua|0.127881897
Pakastekalaa|Muikkua|-0.006076891
Kalasäilykkeitä|Muikkua|0.065331013
Valtamerikalaa|Muikkua|0.157485773
Äyriäisiä|Muikkua|0.072256144
Kalaa|Sisävesikalaa|0.365112639
Petokalaa|Sisävesikalaa|0.499609578
Muikkua|Sisävesikalaa|0.394067217
Sisävesikalaa|Sisävesikalaa|1
Kirjolohta|Sisävesikalaa|0.231100033
Silakkaa|Sisävesikalaa|0.219228101
Itämeren.lohta|Sisävesikalaa|0.136672786
Muuta.Itämerestä|Sisävesikalaa|0.105376927
Pakastekalaa|Sisävesikalaa|-0.013742777
Kalasäilykkeitä|Sisävesikalaa|0.106690375
Valtamerikalaa|Sisävesikalaa|0.188327229
Äyriäisiä|Sisävesikalaa|0.114591507
Kalaa|Kirjolohta|0.530184225
Petokalaa|Kirjolohta|0.205863144
Muikkua|Kirjolohta|0.227258746
Sisävesikalaa|Kirjolohta|0.231100033
Kirjolohta|Kirjolohta|1
Silakkaa|Kirjolohta|0.38967992
Itämeren.lohta|Kirjolohta|0.282808883
Muuta.Itämerestä|Kirjolohta|0.111980079
Pakastekalaa|Kirjolohta|0.166699904
Kalasäilykkeitä|Kirjolohta|0.267376587
Valtamerikalaa|Kirjolohta|0.304243078
Äyriäisiä|Kirjolohta|0.154695247
Kalaa|Silakkaa|0.425923276
Petokalaa|Silakkaa|0.202035751
Muikkua|Silakkaa|0.24510588
Sisävesikalaa|Silakkaa|0.219228101
Kirjolohta|Silakkaa|0.38967992
Silakkaa|Silakkaa|1
Itämeren.lohta|Silakkaa|0.303378534
Muuta.Itämerestä|Silakkaa|0.317929228
Pakastekalaa|Silakkaa|0.106271923
Kalasäilykkeitä|Silakkaa|0.202491349
Valtamerikalaa|Silakkaa|0.27027233
Äyriäisiä|Silakkaa|0.17446606
Kalaa|Itämeren.lohta|0.305152183
Petokalaa|Itämeren.lohta|0.079332036
Muikkua|Itämeren.lohta|0.104823821
Sisävesikalaa|Itämeren.lohta|0.136672786
Kirjolohta|Itämeren.lohta|0.282808883
Silakkaa|Itämeren.lohta|0.303378534
Itämeren.lohta|Itämeren.lohta|1
Muuta.Itämerestä|Itämeren.lohta|0.589928968
Pakastekalaa|Itämeren.lohta|0.083123081
Kalasäilykkeitä|Itämeren.lohta|0.167824298
Valtamerikalaa|Itämeren.lohta|0.415932209
Äyriäisiä|Itämeren.lohta|0.312466845
Kalaa|Muuta.Itämerestä|0.2111844
Petokalaa|Muuta.Itämerestä|0.10229554
Muikkua|Muuta.Itämerestä|0.127881897
Sisävesikalaa|Muuta.Itämerestä|0.105376927
Kirjolohta|Muuta.Itämerestä|0.111980079
Silakkaa|Muuta.Itämerestä|0.317929228
Itämeren.lohta|Muuta.Itämerestä|0.589928968
Muuta.Itämerestä|Muuta.Itämerestä|1
Pakastekalaa|Muuta.Itämerestä|0.061080508
Kalasäilykkeitä|Muuta.Itämerestä|0.164729369
Valtamerikalaa|Muuta.Itämerestä|0.381394106
Äyriäisiä|Muuta.Itämerestä|0.295389062
Kalaa|Pakastekalaa|0.207057983
Petokalaa|Pakastekalaa|-0.055693699
Muikkua|Pakastekalaa|-0.006076891
Sisävesikalaa|Pakastekalaa|-0.013742777
Kirjolohta|Pakastekalaa|0.166699904
Silakkaa|Pakastekalaa|0.106271923
Itämeren.lohta|Pakastekalaa|0.083123081
Muuta.Itämerestä|Pakastekalaa|0.061080508
Pakastekalaa|Pakastekalaa|1
Kalasäilykkeitä|Pakastekalaa|0.345409216
Valtamerikalaa|Pakastekalaa|0.129614914
Äyriäisiä|Pakastekalaa|0.125383855
Kalaa|Kalasäilykkeitä|0.21677219
Petokalaa|Kalasäilykkeitä|0.049204262
Muikkua|Kalasäilykkeitä|0.065331013
Sisävesikalaa|Kalasäilykkeitä|0.106690375
Kirjolohta|Kalasäilykkeitä|0.267376587
Silakkaa|Kalasäilykkeitä|0.202491349
Itämeren.lohta|Kalasäilykkeitä|0.167824298
Muuta.Itämerestä|Kalasäilykkeitä|0.164729369
Pakastekalaa|Kalasäilykkeitä|0.345409216
Kalasäilykkeitä|Kalasäilykkeitä|1
Valtamerikalaa|Kalasäilykkeitä|0.212919187
Äyriäisiä|Kalasäilykkeitä|0.296174683
Kalaa|Valtamerikalaa|0.273916098
Petokalaa|Valtamerikalaa|0.080707577
Muikkua|Valtamerikalaa|0.157485773
Sisävesikalaa|Valtamerikalaa|0.188327229
Kirjolohta|Valtamerikalaa|0.304243078
Silakkaa|Valtamerikalaa|0.27027233
Itämeren.lohta|Valtamerikalaa|0.415932209
Muuta.Itämerestä|Valtamerikalaa|0.381394106
Pakastekalaa|Valtamerikalaa|0.129614914
Kalasäilykkeitä|Valtamerikalaa|0.212919187
Valtamerikalaa|Valtamerikalaa|1
Äyriäisiä|Valtamerikalaa|0.272081755
Kalaa|Äyriäisiä|0.143764271
Petokalaa|Äyriäisiä|0.082444392
Muikkua|Äyriäisiä|0.072256144
Sisävesikalaa|Äyriäisiä|0.114591507
Kirjolohta|Äyriäisiä|0.154695247
Silakkaa|Äyriäisiä|0.17446606
Itämeren.lohta|Äyriäisiä|0.312466845
Muuta.Itämerestä|Äyriäisiä|0.295389062
Pakastekalaa|Äyriäisiä|0.125383855
Kalasäilykkeitä|Äyriäisiä|0.296174683
Valtamerikalaa|Äyriäisiä|0.272081755
Äyriäisiä|Äyriäisiä|1
</t2b>
}}
==== EU kalat ====
* The code that used to be here was moved to [[EU-kalat#Calculations]].
* What updates should be done:
** Plot iterations to see that the model results do not drift.
** Take modelled parameters and develop a MC model to produce predicted concentrations.
*** TCDD concentration should be added to the hierarchical Bayes model for this?
** [[KTL Sarcoma study]], [[EU-kalat]] and [[Goherr: Fish consumption study]] should all be combined into one model. {{comment|# |Can models be combined as text with paste()? This could work if all submodels had unique parameter names like N.eu and N.goh rather than just N. And data lists are merged simply with c().|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 16:22, 22 January 2017 (UTC)}}
** A causal diagram should be drawn to show the model structure.
* JAGS user manual [http://www.stats.ox.ac.uk/~nicholls/MScMCMC15/jags_user_manual.pdf] (with e.g. distribution names and other guidance)
* How to generate predictions in JAGS [http://stats.stackexchange.com/questions/29932/how-to-generate-predictions-with-rjags]
* Using rjags, a simple guide [http://www.johnmyleswhite.com/notebook/2010/08/20/using-jags-in-r-with-the-rjags-package/]
Related:
* Easily generate correlated variables from any distribution (without copulas) [https://www.r-bloggers.com/easily-generate-correlated-variables-from-any-distribution-without-copulas/]
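
The idea raised in the comment above (combining submodels as text with paste() and merging data lists with c()) could look roughly like the sketch below. The two submodels, their node names (N.eu, conc.eu, N.goh, eat.goh and so on) and the data values are hypothetical placeholders, not the actual EU-kalat or Goherr models.

```r
# Hypothetical submodel bodies; every node name carries a unique suffix
sub.eu <- "
  for(i in 1:N.eu) {
    conc.eu[i] ~ dnorm(mu.eu, tau.eu)
  }
  mu.eu ~ dnorm(0, 0.001)
  tau.eu ~ dgamma(0.001, 0.001)
"
sub.goh <- "
  for(i in 1:N.goh) {
    eat.goh[i] ~ dbin(p.goh, 6)
  }
  p.goh ~ dunif(0, 1)
"

# Combine the submodels into one JAGS model text
model.text <- paste("model{", sub.eu, sub.goh, "}", sep = "\n")

# Merge the data lists; names must not collide
data.eu  <- list(N.eu = 3, conc.eu = c(1.2, 0.8, 1.5))   # made-up values
data.goh <- list(N.goh = 4, eat.goh = c(0, 2, 1, 6))     # made-up values
data.all <- c(data.eu, data.goh)
stopifnot(!anyDuplicated(names(data.all)))

cat(model.text)
# With rjags and JAGS installed, the combined model could then be passed on:
# jags <- jags.model(textConnection(model.text), data = data.all)
```

The duplicate-name check is the important part: if two submodels both used plain N, c() would silently keep both entries and the model would be ambiguous.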
==== Concentration-age graph with THL formatting ====
<rcode label="Run on own computer">
# This is code Op_en2721/ on page [[KTL Sarcoma study]]
library(ggplot2)
#library(thlGraphs)
thlPointPlot <- function (data, xvar, yvar, groupvar = NULL, ylabel = yvar,
                          xlabel = NULL, colors = thlColors(n = 12, type = "quali", name = "line"),
                          title = NULL, subtitle = NULL, caption = NULL,
                          legend.position = "none", base.size = 16, linewidth = 3,
                          show.grid.x = FALSE, show.grid.y = TRUE, lang = "fi", ylimits = NULL,
                          marked.treshold = 10, plot.missing = FALSE, xaxis.breaks = waiver(),
                          yaxis.breaks = waiver(), panels = FALSE, nrow.panels = 1,
                          labels.end = FALSE)
{
  lwd <- thlPtsConvert(linewidth)
  gg <- ggplot(
    data,
    aes_(x = substitute(xvar),
        y = substitute(yvar),
        group = ifelse(!is.null(substitute(groupvar)), substitute(groupvar), NA),
        colour = ifelse(!is.null(substitute(groupvar)), substitute(groupvar), ""))
  ) # + geom_line(size = lwd) #!!!!!!!!!!!!!!!!!!!!
  if (isTRUE(plot.missing)) {
    df <- thlNaLines(
      data = data, xvar = deparse(substitute(xvar)),
      yvar = deparse(substitute(yvar)),
      groupvar = unlist(ifelse(deparse(substitute(groupvar)) != "NULL", deparse(substitute(groupvar)), list(NULL)))
    )
    if (!is.null(df) & FALSE) { ##!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      gg <- gg + geom_line(
        data = df, aes_(
          x = substitute(xvar),
          y = substitute(yvar),
          group = ifelse(!is.null(substitute(groupvar)), substitute(groupvar), NA),
          colour = ifelse(!is.null(substitute(groupvar)), substitute(groupvar), "")
        ),
        linetype = 2,
        size = lwd
      )
    }
  }
  if (!is.null(marked.treshold)) {
    if (length(unique(data[, deparse(substitute(xvar))])) > marked.treshold) {
      if (is.factor(data[, deparse(substitute(xvar))]) ||
          is.character(data[, deparse(substitute(xvar))]) ||
          is.logical(data[, deparse(substitute(xvar))])) {
        levs <- levels(factor(data[, deparse(substitute(xvar))]))
        min <- levs[1]
        max <- levs[length(levs)]
      } else {
        min <- min(data[, deparse(substitute(xvar))])
        max <- max(data[, deparse(substitute(xvar))])
      }
      subdata <- data[c(data[, deparse(substitute(xvar))] %in% c(min, max)), ]
      gg <- gg + geom_point(
        data = subdata,
        aes_(
          x = substitute(xvar),
          y = substitute(yvar),
          group = ifelse(!is.null(substitute(groupvar)), substitute(groupvar), NA),
          colour = ifelse(!is.null(substitute(groupvar)), substitute(groupvar), "")
        ), stroke = 1.35 * lwd, fill = "white", shape = 21, size = 10/3 * lwd
      )
    } else {
      gg <- gg + geom_point(stroke = 1.35 * lwd, fill = "white", size = 10/3 * lwd, shape = 21)
    }
  }
  if (isTRUE(labels.end)) {
    if (is.factor(data[, deparse(substitute(xvar))]) ||
        is.character(data[, deparse(substitute(xvar))]) ||
        is.logical(data[, deparse(substitute(xvar))])) {
      levs <- levels(factor(data[, deparse(substitute(xvar))]))
      maxd <- data[data[, deparse(substitute(xvar))] == levs[length(levs)], ]
    } else {
      maxd <- data[data[, deparse(substitute(xvar))] == max(data[, deparse(substitute(xvar))]), ]
    }
    brks <- maxd[, deparse(substitute(yvar))]
    labsut <- maxd[, deparse(substitute(groupvar))]
  } else (brks <- labsut <- waiver())
  gg <- gg + ylab(ifelse(deparse(substitute(ylabel)) == "yvar", deparse(substitute(yvar)), ylabel)) +
    labs(title = title, subtitle = subtitle, caption = caption) +
    thlTheme(
      show.grid.y = show.grid.y,
      show.grid.x = show.grid.x,
      base.size = base.size,
      legend.position = legend.position,
      x.axis.title = ifelse(!is.null(xlabel), TRUE, FALSE)
    ) +
    xlab(ifelse(!is.null(xlabel), xlabel, "")) +
    scale_color_manual(values = colors) +
    thlYaxisControl(
      lang = lang,
      limits = ylimits,
      breaks = yaxis.breaks,
      sec.axis = labels.end,
      sec.axis.breaks = brks,
      sec.axis.labels = labsut
    )
  if (is.factor(data[, deparse(substitute(xvar))]) ||
      is.character(data[, deparse(substitute(xvar))]) ||
      is.logical(data[, deparse(substitute(xvar))])) {
    gg <- gg + scale_x_discrete(breaks = xaxis.breaks, expand = expand_scale(mult = c(0.05)))
  } else (gg <- gg + scale_x_continuous(breaks = xaxis.breaks))
  if (isTRUE(panels)) {
    fmla <- as.formula(paste0("~", substitute(groupvar)))
    gg <- gg + facet_wrap(fmla, scales = "free", nrow = nrow.panels)
  }
  gg
}
# Column names and notes from the data file (translated from Finnish):
# No;Area;Sex;Area;Age (y);TEQ;CaseControl;TEQ quartile;Age class;Exposure;Selected case;Stratum2;Controls selected;Age of case;;;Case found;Number found;
# Search process: 1) Make sure that the column "Selected case" is empty. 2) Set the number of selected to 0 and the age criterion to the strictest one used.
# 3) Calculate. Filter the counts equal to 1 and mark the ID of the case found in the column "Selected case".
# 4) Calculate. Filter "Number found" values 2, 3, etc. and select the case for the right control.
# 5) Loosen the age criterion if necessary and repeat 3)-4). 6) Set the number of selected to 1, 2, 3 etc. and repeat 2)-5).
# Z = helfs01.thl.fi/documents/
sarc <- read.csv("Z:/YMAL_arc/CEHRA_Archived2018/Tutkimus/_until2004/R16_sarkooma/Analyysit/Analyysi/Lopulliset4.csv",
                skip=2, sep=";", dec=",", header=FALSE)
sar <- sarc[c(1:3, 5:460),c(2,3,5,6,7)]
colnames(sar) <- c("Region","Gender","Age","TEQ","Case")
sar$Gender <- factor(sar$Gender, labels=c("Male","Female"))
sar$Case <- factor(sar$Case, labels=c("Case","Control"))
ggplot(sar, aes(x=Age, y=TEQ, colour=Gender))+geom_point()
thlPointPlot(sar, xvar=Age, yvar=TEQ, groupvar=Gender, marked.treshold = 1000,
            legend.position = "bottom",
            xlabel="Age", ylabel="", base.size=30,
            title="Dioxin concentration by age",
            subtitle="(pg/g TEQ in fat)")+
  geom_vline(xintercept=0, size=1.5) # geom_vline() has no width argument; size sets the line thickness
ggsave("Dioxin concentration.png", width=11, height=8)
</rcode>


==See also==

Latest revision as of 18:07, 1 August 2019



Question

Because it is obvious that there is a great need for improved exposure assessment in studying cancer risk of dioxins, we decided to undertake the major effort of conducting a large case-control study on soft-tissue sarcoma and measure dioxin concentrations individually in both patients and controls. Because this can be done accurately only from very large blood samples or from fat samples taken during an operation, we studied STS patients coming to surgery because of their tumor and selected appendicitis patients as controls. In the general population, the exposure to dioxins is almost totally from dietary sources — in Finland mostly from fish — and it varies widely among the population. Because of the extremely long half-life of dioxins, measured levels of dioxin at the time of operation can be used to estimate the lifetime cumulative exposure accurately. There is a priori no simultaneous exposure to chlorophenols or phenoxy acid herbicides, which behave completely differently in the environment, have relatively short half-lives in humans and are excreted in a few days. This enables us to estimate the association of STS with clean dioxin exposure without concomitant exposure to the main chemical, in contrast to occupational studies.[1]

Answer

There is simulated data available about the study. For details, see #Simulated data.


Main fish consumption and PCDD/F variables

Some plots about dioxin congeners.

What congener do you want to plot on X axis?:

+ Show code

Rationale

Methods

Study population

The majority of sarcoma patients in southern Finland are treated by the multidisciplinary sarcoma group of Helsinki University Central Hospital, with the remaining cases in the University Hospitals of Kuopio, Turku, or Tampere. All patients referred to these hospitals for operative treatment of STS between June 1997 (August 1996 in Helsinki) and December 1999 and more than 15 years of age were eligible as cases. The diagnoses were verified histologically for all except 7 patients. Sarcomas connected with known familial or genetic conditions, as well as sarcomas arising in visceral organs and bone, were excluded. Malignancies other than STS, as well as nonmalignant tumors, were also excluded. Some patients were operated on twice during the study period; the second sample was not processed.

All patients who were operated due to an appendicitis diagnosis in a study hospital and who were more than 15 years of age were eligible as controls. They were collected from the same catchment area as the STS patients by dividing it into 15 areas (mainly according to former Finnish health care districts). One hospital performing appendectomy operations was recruited to the study from each area (in Helsinki, 2 hospitals). These were the university, central, or district hospitals of Helsinki, Hyvinkää, Hämeenlinna, Joensuu, Jyväskylä, Kotka, Kuopio, Lahti, Lappeenranta, Pori, Seinäjoki, Tampere, Turku and Vaasa, and the municipality hospitals in Espoo (Jorvi Hospital) and Helsinki (Maria Hospital). Informed consent was obtained from all patients in writing before the operation. The study was approved by the ethics committees of the National Public Health Institute and the hospitals involved.

The total number of patients recruited during the fieldwork was 972. One case was deleted due to missing address information, 1 case and 2 controls due to missing age information, and 3 cases and 11 controls since their fat samples were too small for dioxin analysis. As a result, we had 954 patients (148 cases and 806 controls) available for matching. The age range was 17.0–91.1 years for cases and 15.0–88.7 years for controls. Based on National Cancer Registry data, we caught 70%, 9%, 17% and 26% of STS patients in Helsinki, Turku, Tampere and Kuopio University hospital regions, respectively, during the study period (calendar years 1997–1999). In Helsinki, all patients treated surgically with correct diagnosis were caught and agreed to participate; those not caught were either treated nonsurgically or misdiagnosed. Based on hospital discharge registry data, we estimate that about one-fourth of appendicitis patients were caught on average during the most active collection period, but differences between hospitals were large.

The cases and controls were individually matched for area and age at the end of the fieldwork. This was done to ensure that there are enough controls from small areas and old age groups in the final data set, as it was not possible to analyze all recruited patients for dioxin. Area was defined based on the area of residence using the 15 areas described above. The age was determined at the day of operation. Maximum allowed difference in age between cases and controls was ± 3 years if case was < 38.0 years old, and ± 6 years if case was >= 38.0 years old. The control closest by age was matched to the case. Cases with fewer controls had a priority over cases with more controls. The number of controls per case was limited to 3. For 110 cases, 227 matching controls could be found in the pool. Thirty-nine cases had 1 control, 25 cases had 2 and 46 cases had 3 controls; for 38 cases, no control matching both age and area could be found.
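
The age and area criteria above can be sketched in R as follows. This is a simplified illustration, not the procedure actually used: it handles one case at a time and ignores the rule that cases with fewer available controls had priority. The function names, column names and data are invented for the example.

```r
# Maximum allowed age difference: 3 years below age 38, otherwise 6 years
age.tolerance <- function(case.age) ifelse(case.age < 38, 3, 6)

# Pick up to max.n controls from the same area, closest in age first
match.controls <- function(case, controls, max.n = 3) {
  ok <- controls$Area == case$Area &
    abs(controls$Age - case$Age) <= age.tolerance(case$Age)
  cand <- controls[ok, ]
  cand <- cand[order(abs(cand$Age - case$Age)), ]
  head(cand$Id, max.n)
}

# Hypothetical pool of controls and one hypothetical case
controls <- data.frame(
  Id   = 1:6,
  Area = c(1, 1, 1, 1, 2, 1),
  Age  = c(30.5, 33.5, 28.0, 35.0, 31.0, 31.2)
)
case <- data.frame(Id = 101, Area = 1, Age = 31.0)

matched <- match.controls(case, controls)
print(matched)
```

In this toy pool, control 5 is excluded by area and control 4 by age, and the three closest of the remaining four controls are selected.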

Exposure assessment

From the matched 337 patients, concentrations of the 17 toxic polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs) were measured from a subcutaneous fat sample obtained during an appendectomy or sarcoma operation. Measurements were done by gas chromatography-mass spectrometry (ref. 30) at the Laboratory of Chemistry, which is an accredited testing laboratory (T077) for the analysis of dioxins in human samples (current standard: EN ISO/IEC 17025) and has successfully participated in WHO/Euro intercalibrations. The concentrations were summed up after the value of each congener was multiplied by its relative toxic potency (toxic equivalency factor, TEF). The TEF values according to WHO (ref. 31) were used, resulting in toxic equivalent concentrations (WHO-TEq). Fat samples were analyzed during and after the collection period. Samples from STS patients were always analyzed in a batch containing also samples from appendicitis patients. All analytical work was performed blind so that the chemistry laboratory did not know the diagnosis of the patient. Quality assurance of analysis was performed with 2 separate means: 2 preformulated pools of human fat with different concentrations of dioxins [10.6 (n = 35) and 40.2 (n = 33) ng/kg (WHO-TEq in fat)] were always run with each lot of samples, and 36 individual fat samples with WHO-TEqs ranging from 6.9 to 116 ng/kg fat were analyzed as duplicates. The coefficients of variation for WHO-TEq in preformulated pools were 5.1% and 5.7%, respectively, and in duplicate analysis, 6.2%.[2]
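
The TEQ sum described above is a weighted sum of congener concentrations. A minimal sketch in R, assuming four congeners only: the concentrations are invented, and the TEF values are the ones listed for these congeners in the WHO 2005 vector of the R code on this page, which may differ from the WHO set used in the original analysis.

```r
# TEFs for a few congeners (WHO 2005 values as used in the code on this page)
tef <- c(TCDD = 1, PeCDD = 1, HxCDD = 0.1, OCDD = 0.0003)

# Hypothetical congener concentrations in fat (pg/g)
conc <- c(TCDD = 2.0, PeCDD = 5.0, HxCDD = 30, OCDD = 400)

# Multiply each congener by its TEF (matched by name) and sum up
teq <- sum(conc * tef[names(conc)])
print(teq)  # 2 + 5 + 3 + 0.12 = 10.12 pg/g TEQ
```

Matching by name rather than by position guards against the two vectors being in a different order.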

A detailed questionnaire about socioeconomic and lifestyle factors and chemical exposures was given to the patients in the hospital. If the patient was found not to have received the questionnaire in the hospital or if the patient did not return it, a new copy was sent to the patient’s home address. Of the matched subjects, 84 cases (76%) and 185 controls (81%) also have questionnaire information.

Detailed exposure assessment

The concentrations of the 17 toxic PCDD/F congeners and of the 36 PCB congeners were measured from fat of a subcutaneous tissue sample (0.3–1.5 g of fat) which was obtained during an appendectomy or sarcoma operation. The toxic equivalents (WHOPCDD/F-TEQ and WHOPCB-TEQ) were calculated with the sets of toxic equivalency factors (TEF), recommended by WHO in 1998 (Van den Berg et al., 1998).

Fat from tissue sample was extracted with toluene for 18–24 h using the Soxhlet apparatus. The fat content was determined gravimetrically after changing the solvent to hexane using nonane as a keeper. Fat sample was spiked with a set of 13C-labeled internal standards: sixteen 2,3,7,8-chlorinated PCDD/F congeners, three non-ortho PCBs (PCB 77, 126, 169), and nine other PCBs (PCB 30 [12C-labeled], 80, 101, 105, 138, 153, 156, 180, 194).

The sample was defatted in a silica gel column containing acidic and neutral layers of silica, and all analytes were eluted with dichloromethane (DCM):cyclohexane (c-hexane) (1:1). PCDD/Fs were separated from PCBs on activated carbon column (Carbopack C, 60/80 mesh) containing Celite (Merck 2693). The first fraction including PCBs was eluted with DCM:c-hexane (1:1) following a back elution of the second fraction (PCDD/Fs) with toluene. Eluents from both of the fractions were evaporated using nonane as a keeper and then fractions in n-hexane were further cleaned by passing them through an activated alumina column (Merck 1097). The PCDD/F fraction was eluted from the alumina column with 20% DCM in n-hexane and recovery standards (13C 1,2,3,4-TCDD and 13C 1,2,3,7,8,9-HxCDD) were added to the fraction before DCM and n-hexane were replaced by 10-15 μl of nonane. The PCB fraction was eluted from the alumina column with 2% DCM in n-hexane, and the fraction, after changing the eluent to n-hexane, was transferred to another activated carbon column (without Celite) in order to separate the non-ortho PCBs from other PCBs. DCM (50%) in n-hexane was used to elute other PCBs while non-ortho PCBs were back eluted with toluene. Recovery standards, PCB 159 for other PCBs and 13C PCB 60 for non-ortho PCBs were added prior to analysis; the solvent for other PCBs (DCM:n-hexane, 1:1) was replaced by 300 μl of n-hexane, for non-ortho PCBs toluene was replaced by 10–15 μl of nonane. The quantitation was performed by selective ion recording mode using a VG 70–250 SE (VG Analytical, UK) mass spectrometer (resolution 10,000) equipped with a HP 6890 gas chromatograph with a fused silica capillary column (DB-DIOXIN, 60 m, 0.25 mm, 0.15 μm). Two μl were injected into a split-splitless injector at 270 °C. The temperature programs for PCDD/Fs, non-ortho-PCBs, and other PCBs were:

  • start, 140 °C (4 min), rate 20 °C min−1 to 180 °C (0 min), rate 2 °C min−1 to 270 °C (36 min);
  • start, 140 °C (4 min), rate 20 °C min−1 to 200 °C (0 min), rate 10 °C min−1 to 270 °C (12 min);
  • start, 60 °C (3 min), rate 20 °C min−1 to 200 °C (0 min), rate 4 °C min−1 to 270 °C (14 min); respectively.

Limits of quantitation (LOQ) for PCDD/Fs and non-ortho PCBs varied between 0.1–5 and 1–5 pg g−1 fat, respectively, and for other PCBs between 0.02 and 0.1 ng g−1 fat, depending on each individual congener. Recoveries for internal standards were more than 50% for all congeners. Concentrations were calculated with lower bound method in which the results of congeners with concentrations below the LOQ were designated as nil.
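
The lower bound method mentioned above can be sketched as a simple substitution rule. The function name, congener labels and values below are illustrative only.

```r
# Lower bound method: results below the limit of quantitation are set to zero
lower.bound <- function(conc, loq) ifelse(conc < loq, 0, conc)

# Hypothetical measured values and congener-specific LOQs (pg/g fat)
conc <- c(TCDD = 2.3, TCDF = 0.05, OCDF = 0.8)
loq  <- c(TCDD = 0.1, TCDF = 0.1,  OCDF = 1)

lb <- lower.bound(conc, loq)
print(lb)
```

Because unquantified congeners contribute nothing, a TEQ computed from lower bound values is a conservative (low) estimate of the true sum.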

This code was used to upload the data to Opasnet Base:

+ Show code


Quality control and assurance

Fat samples were analyzed during and after the collection period 1997–1999. All analytical work was performed blind such that the chemistry laboratory knew only the code of the sample. The laboratory reagent and equipment blank samples were treated and analyzed with the same method as the actual samples, one blank for every eight to ten samples. Quality assurance of analysis was performed in two separate ways: (a) two preformulated pools of human fat with different concentrations of PCDD/Fs [10.6 (n = 35) and 40.2 (n = 33) pg g−1 (WHOPCDD/F-TEQ in fat)] and PCBs [4.72 and 24.2 pg g−1 (WHOPCB-TEQ), respectively] were always run with each lot of samples and (b) 36 individual fat samples with WHOPCDD/F-TEQs ranging from 6.9 to 116 pg g−1 and WHOPCB-TEQs from 4.6 to 95 pg g−1 were analyzed in duplicate. The coefficients of variation (CV) for WHOPCDD/F-TEQ in preformulated pools were 5.1% and 5.7%, respectively, and for WHOPCB-TEQ 12 and 9.0%, respectively. In duplicate analysis the CV was 6.2% for WHOPCDD/F-TEQ and 18% for WHOPCB-TEQ.
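
For reference, a coefficient of variation for repeated runs of one control pool can be computed as the standard deviation divided by the mean. The sketch below uses invented measurements; the laboratory's exact formula, in particular for the duplicate samples, is not given in the text and may differ.

```r
# Coefficient of variation in percent
cv.percent <- function(x) 100 * sd(x) / mean(x)

# Hypothetical repeated TEQ results for one preformulated pool (pg/g fat)
pool <- c(10.1, 11.2, 10.4, 10.9, 10.6, 9.9)

cv.pool <- cv.percent(pool)
print(round(cv.pool, 1))
```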

The laboratory has successfully participated in several international quality control studies for the analysis of PCDD/Fs, and PCBs. Matrices in these studies have included cow milk, human milk and human serum. (Yrjänheikki, 1991, Rymen, 1994, WHO, 1996 and Lindström et al., 2000). The laboratory of chemistry in the National Public Health Institute is an accredited testing laboratory (No T077) in Finland (EN ISO/IEC 17025). The scope of accreditation includes PCDD/Fs, non-ortho PCBs, and other PCBs from human tissue samples.

Statistical analyses

Conditional logistic regression analysis was performed with SAS PHREG procedure. Odds ratios were estimated for each quintile of WHO-TEq, the sum of the toxic congeners and the most relevant individual congeners, i.e., 2378-TCDD, 2378-TCDF, 12378-PeCDD, 23478-PeCDF and 123678-HxCDD (abbreviations: T, tetra; Pe, penta; Hx, hexa; Hp, hepta; O, octa; CDD, chlorinated dibenzo-p-dioxin; CDF, chlorinated dibenzofuran). In the other congener-specific analyses, exposures were treated as continuous variables and odds ratios were calculated for an increase of an interquartile range of the exposure.

All analyses were adjusted for sex. Several variables collected with the questionnaire were used as confounders in the analysis one by one. Nonbinary variables were analyzed as quartiles. Radiation therapy given to an STS patient was considered as disease-related and ignored in the analyses if the link to the disease was stated in the questionnaire or if the therapy had been given within 1 year before the operation. The analysis with the largest number of missing values was that with education years with 63 cases and 112 controls, but otherwise there were at least 70 cases and 125 controls in the analyses.
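
The original analysis was run with SAS PHREG. As a rough R counterpart, the sketch below fits a conditional logistic regression with clogit() from the survival package on simulated matched sets and reports the odds ratio for an increase of one interquartile range. All data and variable names are made up, so the numbers carry no meaning for the study.

```r
library(survival)

set.seed(1)
n.sets <- 100
# Simulated matched sets: 1 case and 2 controls per set, no true effect
d <- data.frame(
  set  = rep(seq_len(n.sets), each = 3),
  case = rep(c(1, 0, 0), times = n.sets),
  sex  = rbinom(3 * n.sets, 1, 0.5),
  teq  = rlnorm(3 * n.sets, meanlog = 3, sdlog = 0.6)
)

# Conditional logistic regression stratified by matched set, adjusted for sex
fit <- clogit(case ~ teq + sex + strata(set), data = d)

# Odds ratio and 95 % confidence interval for an increase of one IQR in exposure
iqr    <- IQR(d$teq)
or.iqr <- exp(coef(fit)["teq"] * iqr)
ci.iqr <- exp(confint(fit)["teq", ] * iqr)
print(c(OR = unname(or.iqr), ci.iqr))
```

Scaling the coefficient by the IQR before exponentiating gives an odds ratio that compares a typical high exposure with a typical low one, which is easier to interpret than the odds ratio per unit of concentration.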

Fish consumption was studied in detail. Specific questions about the frequency of fish consumption were asked: 1 about total fish consumption, and 10 about specific types of fish or fish species. Four fish types contributed most to the total fish consumption. They were assumed to have high (Baltic herring, Baltic salmon) or low (predatory fish from lakes, rainbow trout) dioxin concentration based on previous results.[3] The consumption frequencies (times per month) were calculated for high- and low-dioxin fish separately based on these 4 fish types. Exposure to the following chemicals was asked as a binary variable: solvents, solvent-based paints, formaldehyde, insecticides, fungicides/herbicides, wood preservatives, strong detergents, heavy metals, other chemicals.

Data

The code below runs the main fish consumption and PCDD/F variables, but because this is personal-level data, you need a password to run it. However, you can see ready-made results [1].

For variable descriptions, see D↷


Questionnaire

Variable information

The variable information was originally documented in Log file about the statistical analyses: Part 1, but unfortunately mostly in Finnish.



Data management

The data management code takes the original data files and merges them. It works only if the files are available.



Interpretations

The consumption of hard fat is calculated in the following way. Q## means the value from the survey question, and I## means the interpretation from the table below. Q24&I24 means that the answer to question Q24 is quantified by using the matching value from interpretation I24. Q23*I23 means that the survey value and the interpretation are multiplied.

total_fat = (Q23a*I23 + Q23b*I23) * Q24&I24 + Q25&I25 * Q26&I26 * Q21a&I21 + Q27&I27 * 20

The code assumes that a person uses 20 g/d of fat for cooking. The questions are: Q21a, how much bread; Q23, how much a) milk and b) sourmilk; Q24, what kind of milk; Q25, what fat on bread; Q26, how much fat on bread; Q27, what fat for cooking.
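The formula can be written out as a small lookup-and-multiply function. The Python sketch below is an illustration under stated assumptions, not the page's own code: the interpretation values are copied from the assumptions table on this page, the function and constant names are invented for the example, and Q23a and Q23b are assumed to be numbers of glasses.

```python
# Interpretation values copied from the assumptions table on this page.
GLASS_DL = 2                                                           # I23
MILK_FAT = {1: 0.035, 2: 0.015, 3: 0.01, 4: 0, 5: 0, 6: 0.01, 7: 0}    # I24
BREAD_HARD_FAT_SHARE = {1: 0, 2: 0.15, 3: 0.5, 4: 1}                   # I25
FAT_G_PER_SLICE = {1: 0, 2: 3, 3: 7, 4: 15}                            # I26
BREAD_AMOUNT = {1: 1.5, 2: 2.5, 3: 7.5, 4: 15, 5: 50, 6: 100}          # I21 (Q21a)
COOKING_HARD_FAT_SHARE = {1: 0, 2: 0.15, 3: 0.5, 4: 0.5, 5: 1, 6: 0}   # I27
COOKING_FAT_G_PER_DAY = 20


def total_fat(q23a, q23b, q24, q25, q26, q21a, q27):
    """Evaluate the total_fat formula term by term, as it is written in the text."""
    milk = (q23a * GLASS_DL + q23b * GLASS_DL) * MILK_FAT[q24]
    bread = BREAD_HARD_FAT_SHARE[q25] * FAT_G_PER_SLICE[q26] * BREAD_AMOUNT[q21a]
    cooking = COOKING_HARD_FAT_SHARE[q27] * COOKING_FAT_G_PER_DAY
    return milk + bread + cooking


# Hypothetical respondent: 2 glasses of full milk, 1 glass of sourmilk,
# butter on bread (answer 3 for amount, answer 4 for bread), butter for cooking.
print(round(total_fat(q23a=2, q23b=1, q24=1, q25=4, q26=3, q21a=4, q27=5), 2))
```

Keeping each interpretation in its own dictionary mirrors the Q##&I## notation: the survey answer is the key and the interpretation is the value.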

The following assumptions are used to interpret survey answers:

Assumptions for calculations (-)
Obs | Variable | Value | Unit | Result | Description | Questionnaire answer (translated from Finnish)
1 | Q23 |  | dl per glass | 2 | Size of a glass of milk or sourmilk | 
2 | Q24 | 1 | fat g/dl | 0.035 | full milk, fat g/dl | whole milk
3 | Q24 | 2 | fat g/dl | 0.015 | light milk, fat g/dl | light milk
4 | Q24 | 3 | fat g/dl | 0.01 | 1% milk, fat g/dl | 1% milk
5 | Q24 | 4 | fat g/dl | 0 | fat-free milk | fat-free milk
6 | Q24 | 5 | fat g/dl | 0 | fat-free sourmilk | fat-free sourmilk or buttermilk
7 | Q24 | 6 | fat g/dl | 0.01 | other sourmilk, fat g/dl | other sourmilk
8 | Q24 | 7 | fat g/dl | 0 | none of these | I drink neither milk nor sourmilk
9 | Q25 | 1 | hard fat, proportion | 0 | none | nothing
10 | Q25 | 2 | hard fat, proportion | 0.15 | soft margarine, share of hard fat | vegetable margarine
11 | Q25 | 3 | hard fat, proportion | 0.5 | oil-butter-mix, share of hard fat | butter and vegetable fat blend
12 | Q25 | 4 | hard fat, proportion | 1 | butter | butter
13 | Q26 | 1 | fat g/slice of bread | 0 | 0 g per slice of bread | none at all
14 | Q26 | 2 | fat g/slice of bread | 3 | 3 g per slice of bread | 10 g per 3 slices
15 | Q26 | 3 | fat g/slice of bread | 7 | 7 g per slice of bread | 10 g per 1-2 slices
16 | Q26 | 4 | fat g/slice of bread | 15 | 15 g per slice of bread | more than 10 g per slice
17 | Q27 | 1 | hard fat fraction | 0 | hard fat fraction in the baking fat used | vegetable oil
18 | Q27 | 2 | hard fat fraction | 0.15 | hard fat fraction in the baking fat used | vegetable margarine
19 | Q27 | 3 | hard fat fraction | 0.5 | hard fat fraction in the baking fat used | household margarine
20 | Q27 | 4 | hard fat fraction | 0.5 | hard fat fraction in the baking fat used | butter and vegetable fat blend
21 | Q27 | 5 | hard fat fraction | 1 | hard fat fraction in the baking fat used | butter
22 | Q27 | 6 | hard fat fraction | 0 | hard fat fraction in the baking fat used | no fat at all
23 | Q35 | 1 | alcohol times/a | 300 |  | daily
24 | Q35 | 2 | alcohol times/a | 100 |  | a few times a week
25 | Q35 | 3 | alcohol times/a | 50 |  | about once a week
26 | Q35 | 4 | alcohol times/a | 25 |  | a couple of times a month
27 | Q35 | 5 | alcohol times/a | 12 |  | about once a month
28 | Q35 | 6 | alcohol times/a | 6 |  | about once every couple of months
29 | Q35 | 7 | alcohol times/a | 4 |  | 3-4 times a year
30 | Q35 | 8 | alcohol times/a | 2 |  | a couple of times a year
31 | Q35 | 9 | alcohol times/a | 1 |  | once a year or less often
32 | Q35 | 10 | alcohol times/a | 0 |  | never
33 | Q36 | 1 | alcohol portion | 0 | g alcohol | less than one
34 | Q36 | 2 | alcohol portion | 12 | g alcohol | 1 portion
35 | Q36 | 3 | alcohol portion | 24 | g alcohol | 2 portions
36 | Q36 | 4 | alcohol portion | 36 | g alcohol | 3 portions
37 | Q36 | 5 | alcohol portion | 55 | g alcohol | 4-5 portions
38 | Q36 | 6 | alcohol portion | 96 | g alcohol | 6-10 portions
39 | Q36 | 7 | alcohol portion | 150 | g alcohol | more than 10 portions
40 | Q21a | 1 | g/day carbohydrates | 1.5 | carbohydrates per day of 100 g bread slices | bread in 100 g slices. Assumption: 50% carbohydrate
41 | Q21a | 2 | g/day carbohydrates | 2.5 | carbohydrates per day of 100 g bread slices | bread in 100 g slices.
42 | Q21a | 3 | g/day carbohydrates | 7.5 | carbohydrates per day of 100 g bread slices | bread in 100 g slices.
43 | Q21a | 4 | g/day carbohydrates | 15 | carbohydrates per day of 100 g bread slices | bread in 100 g slices.
44 | Q21a | 5 | g/day carbohydrates | 50 | carbohydrates per day of 100 g bread slices | bread in 100 g slices.
45 | Q21a | 6 | g/day carbohydrates | 100 | carbohydrates per day of 100 g bread slices | bread in 100 g slices.
46 | Q21b | 1 | g/day carbohydrates | 0.84 | carbohydrates per day of 200 g porridge | porridge in 200 g portions. Assumption: grain is 70% carbohydrate and makes up 20% of the dish
47 | Q21b | 2 | g/day carbohydrates | 1.4 | carbohydrates per day of 200 g porridge | porridge in 200 g portions.
48 | Q21b | 3 | g/day carbohydrates | 4.2 | carbohydrates per day of 200 g porridge | porridge in 200 g portions.
49 | Q21b | 4 | g/day carbohydrates | 8.4 | carbohydrates per day of 200 g porridge | porridge in 200 g portions.
50 | Q21b | 5 | g/day carbohydrates | 28 | carbohydrates per day of 200 g porridge | porridge in 200 g portions.
51 | Q21b | 6 | g/day carbohydrates | 56 | carbohydrates per day of 200 g porridge | porridge in 200 g portions.
52 | Q21c | 1 | g/day carbohydrates | 1.2 | carbohydrates per day of 200 g pasta | pasta in 200 g portions. Assumption: grain is 80% carbohydrate and makes up 25% of the dish
53 | Q21c | 2 | g/day carbohydrates | 2 | carbohydrates per day of 200 g pasta | pasta in 200 g portions.
54 | Q21c | 3 | g/day carbohydrates | 6 | carbohydrates per day of 200 g pasta | pasta in 200 g portions.
55 | Q21c | 4 | g/day carbohydrates | 12 | carbohydrates per day of 200 g pasta | pasta in 200 g portions.
56 | Q21c | 5 | g/day carbohydrates | 40 | carbohydrates per day of 200 g pasta | pasta in 200 g portions.
57 | Q21c | 6 | g/day carbohydrates | 80 | carbohydrates per day of 200 g pasta | pasta in 200 g portions.
58 | Q21d | 1 | g/day carbohydrates | 1.26 | carbohydrates per day of 200 g muesli etc. | others (muesli etc.). Assumption: grain is 70% carbohydrate and makes up 30% of the dish
59 | Q21d | 2 | g/day carbohydrates | 2.1 | carbohydrates per day of 200 g muesli etc. | others (muesli etc.).
60 | Q21d | 3 | g/day carbohydrates | 6.3 | carbohydrates per day of 200 g muesli etc. | others (muesli etc.).
61 | Q21d | 4 | g/day carbohydrates | 12.6 | carbohydrates per day of 200 g muesli etc. | others (muesli etc.).
62 | Q21d | 5 | g/day carbohydrates | 42 | carbohydrates per day of 200 g muesli etc. | others (muesli etc.).
63 | Q21d | 6 | g/day carbohydrates | 84 | carbohydrates per day of 200 g muesli etc. | others (muesli etc.).
64 | Q21e | 1 | g/day carbohydrates | 0.3 | carbohydrates per day of 200 g yoghurt etc. | viili or yoghurt, sugar. Assumption: 5% carbohydrate (Doc. Geigy p. 479)
65 | Q21e | 2 | g/day carbohydrates | 0.5 | carbohydrates per day of 200 g yoghurt etc. | viili or yoghurt, sugar.
66 | Q21e | 3 | g/day carbohydrates | 1.5 | carbohydrates per day of 200 g yoghurt etc. | viili or yoghurt, sugar.
67 | Q21e | 4 | g/day carbohydrates | 3 | carbohydrates per day of 200 g yoghurt etc. | viili or yoghurt, sugar.
68 | Q21e | 5 | g/day carbohydrates | 10 | carbohydrates per day of 200 g yoghurt etc. | viili or yoghurt, sugar.
69 | Q21e | 6 | g/day carbohydrates | 20 | carbohydrates per day of 200 g yoghurt etc. | viili or yoghurt, sugar.
70 | Q21f | 1 | g/day carbohydrates | 0.015 | carbohydrates per 50 g cheese | low-fat cheese, sugar.
71 | Q21f | 2 | g/day carbohydrates | 0.025 | carbohydrates per 50 g cheese | low-fat cheese, sugar. Assumption: 1% carbohydrate (Doc. Geigy p. 479)
72 | Q21f | 3 | g/day carbohydrates | 0.075 | carbohydrates per 50 g cheese | low-fat cheese, sugar.
73 | Q21f | 4 | g/day carbohydrates | 0.15 | carbohydrates per 50 g cheese | low-fat cheese, sugar.
74 | Q21f | 5 | g/day carbohydrates | 0.5 | carbohydrates per 50 g cheese | low-fat cheese, sugar.
75 | Q21f | 6 | g/day carbohydrates | 1 | carbohydrates per 50 g cheese | low-fat cheese, sugar.
76 | Q21g | 1 | g/day carbohydrates | 0.015 | carbohydrates per 50 g cheese | other cheese, sugar. Assumption: 1% carbohydrate (Doc. Geigy p. 479)
77 | Q21g | 2 | g/day carbohydrates | 0.025 | carbohydrates per 50 g cheese | other cheese, sugar.
78 | Q21g | 3 | g/day carbohydrates | 0.075 | carbohydrates per 50 g cheese | other cheese, sugar.
79 | Q21g | 4 | g/day carbohydrates | 0.15 | carbohydrates per 50 g cheese | other cheese, sugar.
80 | Q21g | 5 | g/day carbohydrates | 0.5 | carbohydrates per 50 g cheese | other cheese, sugar.
81 | Q21g | 6 | g/day carbohydrates | 1 | carbohydrates per 50 g cheese | other cheese, sugar.
82 | Q21h | 1 | g/day carbohydrates | 0.3 | carbohydrates per 100 g ice cream | ice cream. Assumption: 10% carbohydrate
83 | Q21h | 2 | g/day carbohydrates | 0.5 | carbohydrates per 100 g ice cream | ice cream.
84 | Q21h | 3 | g/day carbohydrates | 1.5 | carbohydrates per 100 g ice cream | ice cream.
85 | Q21h | 4 | g/day carbohydrates | 3 | carbohydrates per 100 g ice cream | ice cream.
86 | Q21h | 5 | g/day carbohydrates | 10 | carbohydrates per 100 g ice cream | ice cream.
87 | Q21h | 6 | g/day carbohydrates | 20 | carbohydrates per 100 g ice cream | ice cream.
88 | Q21i | 1 | g/day hard fat | 0.12 | hard fat per 200 g yoghurt etc. | viili or yoghurt, fat. Assumption: 2% fat
89 | Q21i | 2 | g/day hard fat | 0.2 | hard fat per 200 g yoghurt etc. | viili or yoghurt, fat.
90 | Q21i | 3 | g/day hard fat | 0.6 | hard fat per 200 g yoghurt etc. | viili or yoghurt, fat.
91 | Q21i | 4 | g/day hard fat | 1.2 | hard fat per 200 g yoghurt etc. | viili or yoghurt, fat.
92 | Q21i | 5 | g/day hard fat | 4 | hard fat per 200 g yoghurt etc. | viili or yoghurt, fat.
93 | Q21i | 6 | g/day hard fat | 8 | hard fat per 200 g yoghurt etc. | viili or yoghurt, fat.
94 | Q21j | 1 | g/day hard fat | 0.15 | hard fat per 50 g low-fat cheese | low-fat cheese. Assumption: 10% fat
95 | Q21j | 2 | g/day hard fat | 0.25 | hard fat per 50 g low-fat cheese | low-fat cheese.
96 | Q21j | 3 | g/day hard fat | 0.75 | hard fat per 50 g low-fat cheese | low-fat cheese.
97 | Q21j | 4 | g/day hard fat | 1.5 | hard fat per 50 g low-fat cheese | low-fat cheese.
98 | Q21j | 5 | g/day hard fat | 5 | hard fat per 50 g low-fat cheese | low-fat cheese.
99 | Q21j | 6 | g/day hard fat | 10 | hard fat per 50 g low-fat cheese | low-fat cheese.
100 | Q21k | 1 | g/day hard fat | 0.45 | hard fat per 50 g cheese | cheese. Assumption: 30% fat (Fineli)
101 | Q21k | 2 | g/day hard fat | 0.75 | hard fat per 50 g cheese | cheese.
102 | Q21k | 3 | g/day hard fat | 2.25 | hard fat per 50 g cheese | cheese.
103 | Q21k | 4 | g/day hard fat | 4.5 | hard fat per 50 g cheese | cheese.
104 | Q21k | 5 | g/day hard fat | 15 | hard fat per 50 g cheese | cheese.
105 | Q21k | 6 | g/day hard fat | 30 | hard fat per 50 g cheese | cheese.
106 | Q21l | 1 | g/day hard fat | 0.3 | hard fat per 100 g ice cream | ice cream. Assumption: 10% fat
107 | Q21l | 2 | g/day hard fat | 0.5 | hard fat per 100 g ice cream | ice cream.
108 | Q21l | 3 | g/day hard fat | 1.5 | hard fat per 100 g ice cream | ice cream.
109 | Q21l | 4 | g/day hard fat | 3 | hard fat per 100 g ice cream | ice cream.
110 | Q21l | 5 | g/day hard fat | 10 | hard fat per 100 g ice cream | ice cream.
111 | Q21l | 6 | g/day hard fat | 20 | hard fat per 100 g ice cream | ice cream.
112 | Q21m | 1 | g/day hard fat | 0.45 | hard fat per 100 g meat | meat dishes. Assumption: 15% fat (Doc. Geigy p. 481)
113 | Q21m | 2 | g/day hard fat | 0.75 | hard fat per 100 g meat | meat dishes.
114 | Q21m | 3 | g/day hard fat | 2.25 | hard fat per 100 g meat | meat dishes.
115 | Q21m | 4 | g/day hard fat | 4.5 | hard fat per 100 g meat | meat dishes.
116 | Q21m | 5 | g/day hard fat | 15 | hard fat per 100 g meat | meat dishes.
117 | Q21m | 6 | g/day hard fat | 30 | hard fat per 100 g meat | meat dishes.
118 | Q21n | 1 | g/day hard fat | 0.15 | hard fat per 100 g fish | fish dishes. Assumption: 5% hard fat; 2-5% according to Fineli
119 | Q21n | 2 | g/day hard fat | 0.25 | hard fat per 100 g fish | fish dishes.
120 | Q21n | 3 | g/day hard fat | 0.75 | hard fat per 100 g fish | fish dishes.
121 | Q21n | 4 | g/day hard fat | 1.5 | hard fat per 100 g fish | fish dishes.
122 | Q21n | 5 | g/day hard fat | 5 | hard fat per 100 g fish | fish dishes.
123 | Q21n | 6 | g/day hard fat | 10 | hard fat per 100 g fish | fish dishes.

How much mass, energy, and dioxin does one portion contain? The data are either guesswork or from Fineli.

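Food energy and dioxin (g, kJ, pg/portion)
Obs | Food | Mass (g) | Energy (kJ) | Dioxin (pg/portion) | Food in English
1 | Kalaa | 100 | 600 | 7 | fish
2 | Silakkaa | 100 | 792 | 470 | Baltic herring
3 | Petokalaa | 100 | 301 | 25 | predatory fish
4 | Muikkua | 100 | 750 | 28 | vendace
5 | Sisävesikalaa | 100 | 668 | 23 | inland water fish
6 | Kirjolohta | 100 | 1067 | 74 | rainbow trout
7 | Itämeren.lohta | 100 | 1067 | 770 | Baltic salmon
8 | Muuta.Itämerestä | 100 | 668 | 15 | other fish from the Baltic Sea
9 | Pakastekalaa | 100 | 324 | 7 | frozen fish
10 | Kalasäilykkeitä | 60 | 600 | 7 | canned fish
11 | Valtamerikalaa | 100 | 600 | 7 | ocean fish
12 | Äyriäisiä | 60 | 200 | 7 | crustaceans
13 | Leipää | 50 | 406 | 0.01 | bread
14 | Puuroja | 200 | 642 | 0.02 | porridges
15 | Makaronia | 200 | 846 | 0.02 | macaroni
16 | Muutaviljaa | 150 | 600 | 0.02 | other grain products
17 | Viiliä | 200 | 334 | 0.008 | viili
18 | Juustoja | 40 | 300 | 0.012 | cheeses
19 | Rasvaisia.juustoja | 40 | 600 | 0.03 | high-fat cheeses
20 | Jäätelöä | 150 | 1200 | 0.03 | ice cream
21 | Liharuokaa | 150 | 1400 | 1.5 | meat dishes
22 | Maitoa | 200 | 358 | 0.004 | milk
23 | Piimää | 200 | 358 | 0.004 | sourmilk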
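
Portions per month (portions/mo)
Obs | Answer | Interpretation | Answer in English
1 | En lainkaan | 0.003 | Not at all
2 | Harvemmin kuin kerran kuukaudessa tai en lainkaan | 0.1 | Less often than once a month or not at all
3 | Harvemmin kuin kerran kuukaudessa | 0.5 | Less often than once a month
4 | Kerran tai pari kuukaudessa | 1.5 | Once or twice a month
5 | Kerran viikossa | 4 | Once a week
6 | Pari kertaa viikossa | 8 | A couple of times a week
7 | Lähes joka päivä | 20 | Almost every day
8 | Kerran päivässä tai useammin | 40 | Once a day or more often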
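One natural way to combine the two tables above is to multiply the interpreted portions per month by the dioxin content of one portion and sum over foods. The Python sketch below shows that arithmetic for three fish types only; it is an assumption about how the tables are meant to be combined, not the page's own code, and the function name and the respondent's answers are hypothetical.

```python
# Portions per month for each questionnaire answer (from the table above).
PORTIONS_PER_MONTH = {
    "En lainkaan": 0.003,
    "Harvemmin kuin kerran kuukaudessa tai en lainkaan": 0.1,
    "Harvemmin kuin kerran kuukaudessa": 0.5,
    "Kerran tai pari kuukaudessa": 1.5,
    "Kerran viikossa": 4,
    "Pari kertaa viikossa": 8,
    "Lähes joka päivä": 20,
    "Kerran päivässä tai useammin": 40,
}

# Dioxin per portion (pg) for a subset of foods, as read from the food table.
DIOXIN_PG_PER_PORTION = {
    "Silakkaa": 470,
    "Kirjolohta": 74,
    "Itämeren.lohta": 770,
}


def monthly_dioxin_pg(answers):
    """Sum portions/month times pg/portion over the foods a respondent answered."""
    return sum(
        PORTIONS_PER_MONTH[answer] * DIOXIN_PG_PER_PORTION[food]
        for food, answer in answers.items()
    )


# Hypothetical respondent.
answers = {
    "Silakkaa": "Kerran viikossa",
    "Kirjolohta": "Pari kertaa viikossa",
    "Itämeren.lohta": "En lainkaan",
}
print(round(monthly_dioxin_pg(answers), 2))
```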

Analyses

Simulated data

This code was used to create a CSV file that contains simulated data from this study. When compared with the original data, the simulated data
  • has the same number of observations,
  • has the same range of values in each variable,
  • has approximately the same correlation structure between all variables.
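The three properties listed above can be illustrated for a pair of variables. The Python sketch below is one possible approach and is not the method used on this page: it draws correlated normal values and rescales them linearly to the original ranges, which keeps the number of observations and the range and approximately keeps the Pearson correlation, but does not preserve the shapes of the original distributions. All function names and input values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rescale(values, lo, hi):
    """Map values linearly so that their minimum is lo and their maximum is hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def simulate_pair(x, y, seed=1):
    """Simulate two variables with the ranges of x and y and a similar correlation."""
    rng = random.Random(seed)
    r = pearson(x, y)
    resid_sd = math.sqrt(max(0.0, 1 - r * r))
    z1 = [rng.gauss(0, 1) for _ in x]
    z2 = [r * a + resid_sd * rng.gauss(0, 1) for a in z1]
    # A linear rescaling does not change the Pearson correlation of z1 and z2.
    return rescale(z1, min(x), max(x)), rescale(z2, min(y), max(y))


# Hypothetical original data: age (years) and a concentration.
age = [35, 42, 50, 58, 63, 71]
conc = [12, 20, 31, 45, 40, 66]
sim_age, sim_conc = simulate_pair(age, conc)
print(len(sim_age), min(sim_age), max(sim_age))
```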


POPs and obesity

Dioxins and PCBs have been associated with type 2 diabetes. Do dioxins cause diabetes, does diabetes decrease dioxin elimination, does obesity both increase diabetes and decrease dioxin elimination, or is something else going on? We tried to make sense of this by looking at the sarcoma study data.


Self-reported chemical exposure

We looked at self-reported chemical exposure, especially pesticides and wood preservatives.

We also looked at the impact of self-reported occupation, recoded into 9 groups. This is best done in the unmatched dataset, but some analyses were also done with the matched dataset. Age was the only clearly significant variable, with sarcoma risk increasing by 8% per year. Male sex seemed to increase the risk, but the effect was not statistically significant. None of the differences between occupation groups were statistically significant, and they did not show a pattern in which putatively chemically exposed groups would have a higher risk.
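To put the age effect in perspective, a per-year increase compounds multiplicatively over longer age differences. The short Python sketch below assumes that the 8% figure corresponds to an odds ratio of 1.08 per year of age from a logistic model that is log-linear in age; that reading is an assumption, and the function name is hypothetical.

```python
def odds_ratio_over_years(or_per_year, years):
    """Odds ratio for an age difference of `years`, given a per-year odds ratio."""
    return or_per_year ** years


# With OR = 1.08 per year, a 10-year age difference gives about 2.16.
print(round(odds_ratio_over_years(1.08, 10), 2))
```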


Correlation of dioxin and fish

How do individual dioxin congeners correlate with the individual fish parameters in the questionnaire?


These estimates are based on the code above.

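Binomial distribution parameter (probability)
Obs | Fish | Parameter | Fish in English
1 | Kalaa | 0.23179078 | fish
2 | Petokalaa | 0.17746642 | predatory fish
3 | Muikkua | 0.14939457 | vendace
4 | Sisävesikalaa | 0.09493785 | inland water fish
5 | Kirjolohta | 0.29775961 | rainbow trout
6 | Silakkaa | 0.2152247 | Baltic herring
7 | Itämeren.lohta | 0.08175282 | Baltic salmon
8 | Muuta.Itämerestä | 0.0282421 | other fish from the Baltic Sea
9 | Pakastekalaa | 0.21510166 | frozen fish
10 | Kalasäilykkeitä | 0.21598007 | canned fish
11 | Valtamerikalaa | 0.1078198 | ocean fish
12 | Äyriäisiä | 0.16021416 | crustaceans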

Correlation coefficients between fish dishes



EU kalat

  • The code that used to be here was moved to EU-kalat#Calculations.
  • What updates should be done:
    • Plot iterations to see that the model results do not drift.
    • Take modelled parameters and develop a MC model to produce predicted concentrations.
      • TCDD concentration should be added to the hierarchical Bayes model for this?
    • KTL Sarcoma study, EU-kalat and Goherr: Fish consumption study should all be combined into one model. Comment: Can models be combined as text with paste()? This could work if all submodels had unique parameter names like N.eu and N.goh rather than just N. Data lists could then be merged simply with c(). --Jouni (talk) 16:22, 22 January 2017 (UTC)
    • A causal diagram should be drawn to show the model structure.
  • JAGS user manual [2] (with e.g. distribution names and other guidance)
  • How to generate predictions in JAGS [3]
  • Using rjags, a simple guidance [4]

Related:

  • Easily generate correlated variables from any distribution (without copulas) [5]

Concentration-age graph with THL formatting


See also

Related files

References

  1. Tuomisto JT, Pekkanen J, Kiviranta H, Tukiainen E, Vartiainen T, Tuomisto J. Soft-tissue sarcoma and dioxin: a case-control study. Int J Cancer 2004; 108: 893-900.
  2. Chemosphere (2005) 60: 78: 854-869
  3. Kiviranta H, Korhonen M, Hallikainen A, Vartiainen T. Kalojen dioksiinien ja PCB:eiden kulkeutuminen ihmiseen. Ympäristö ja Terveys 2000; 31: 65-9.