Goherr: Fish consumption study
This page is a study. The page identifier is Op_en7749.
Moderator: Arja
Question
How are Baltic herring and salmon used as human food in the Baltic Sea countries? Which determinants affect people's eating habits regarding these fish species?
Answer
Original questionnaire analysis results
- 13.3.2017 (comment: these should be presented somewhere. --Arja, 07:39, 26 April 2017 (UTC))
Consumption amount estimates
- Model run 21.4.2017 [1] first distribution
- Model run 18.5.2017 with modelled data; with direct survey data
# This is code Op_en7749/ on page [[Goherr: Fish consumption study#Answer]]

library(OpasnetUtils)
library(ggplot2)

# Note: usesurvey (logical) is not defined in this code; it must be set before the code is run.

objects.latest("Op_en7749", code_name="initiate") # [[Goherr: Fish consumption study]] ovariables

if(usesurvey) {
  objects.latest("Op_en7749", code_name="surveyjsp") # jsp ovariable directly based on survey data (N=2217)
  openv.setN(nrow(jsp@data))
}

amount <- EvalOutput(amount)

if(usesurvey) {
  oprint(summary(amount, marginals=c("Gender", "Country", "Fish","Ages")))

  print(ggplot(amount@output, aes(x=amountResult+0.1, colour=Country))+stat_ecdf()+scale_x_log10()+
    facet_wrap(~ Fish)+
    labs(x="Fish consumption (g /d)", y="Cumulative frequency")+theme_gray(base_size=24))

  print(ggplot(amount@output, aes(x=amountResult+0.1, colour=Ages))+stat_ecdf()+scale_x_log10()+
    facet_grid(Country ~ Fish)+
    labs(x="Fish consumption (g /d)", y="Cumulative frequency")+theme_gray(base_size=24))

  print(ggplot(amount@output, aes(x=amountResult+0.1, colour=Gender))+stat_ecdf()+scale_x_log10()+
    facet_grid(Country ~ Fish)+
    labs(x="Fish consumption (g /d)", y="Cumulative frequency")+theme_gray(base_size=24))

  print(ggplot(often@output, aes(x=oftenResult+0.1, colour=Country))+stat_ecdf()+scale_x_log10()+
    facet_wrap(~ Fish))
  print(ggplot(oftenside@output, aes(x=oftensideResult+0.1, colour=Country))+stat_ecdf()+scale_x_log10()+
    facet_wrap(~ Fish))
  print(ggplot(much@output, aes(x=muchResult+0.1, colour=Country))+stat_ecdf()+scale_x_log10()+
    facet_wrap(~ Fish))
  print(ggplot(muchside@output, aes(x=muchsideResult+0.1, colour=Country))+stat_ecdf()+scale_x_log10()+
    facet_wrap(~ Fish))
} else {
  oprint(summary(amount, marginals=c("Fish")))

  print(ggplot(amount@output, aes(x=amountResult+0.1, colour=Fish))+stat_ecdf()+scale_x_log10()+
    labs(x="Fish consumption (g /d)", y="Cumulative frequency")+theme_gray(base_size=24))

  print(ggplot(often@output, aes(x=oftenResult+0.1, colour=Fish))+stat_ecdf()+scale_x_log10())
  print(ggplot(oftenside@output, aes(x=oftensideResult+0.1, colour=Fish))+stat_ecdf()+scale_x_log10())
  print(ggplot(much@output, aes(x=muchResult+0.1, colour=Fish))+stat_ecdf()+scale_x_log10())
  print(ggplot(muchside@output, aes(x=muchsideResult+0.1, colour=Fish))+stat_ecdf()+scale_x_log10())
}
Rationale
A survey of eating habits concerning Baltic herring and salmon in Denmark, Estonia, Finland and Sweden was conducted in September 2016 by Taloustutkimus oy. The content of the questionnaire can be accessed in Google Drive. The actual data can be found via the link below (see Data).
Data
Questionnaire
Original data file: File:Goherr fish consumption.csv.
Assumptions
The following assumptions are used:
Obs | Variable | Value | Unit | Result | Description |
---|---|---|---|---|---|
1 | freq | 1 | times /a | 0 | Never |
2 | freq | 2 | times /a | 0.5 - 0.9 | less than once a year |
3 | freq | 3 | times /a | 2 - 5 | A few times a year |
4 | freq | 4 | times /a | 12 - 36 | 1 - 3 times per month |
5 | freq | 5 | times /a | 52 | once a week |
6 | freq | 6 | times /a | 104 - 208 | 2 - 4 times per week |
7 | freq | 7 | times /a | 260 - 364 | 5 or more times per week |
8 | amdish | 1 | g /serving | 20 - 70 | 1/6 plate or below (50 grams) |
9 | amdish | 2 | g /serving | 70 - 130 | 1/3 plate (100 grams) |
10 | amdish | 3 | g /serving | 120 - 180 | 1/2 plate (150 grams) |
11 | amdish | 4 | g /serving | 170 - 230 | 2/3 plate (200 grams) |
12 | amdish | 5 | g /serving | 220 - 280 | 5/6 plate (250 grams) |
13 | amdish | 6 | g /serving | 270 - 400 | full plate (300 grams) |
14 | amdish | 7 | g /serving | 400 - 550 | overly full plate (500 grams) |
15 | ingredient | | fraction | 0.1 - 0.3 | Fraction of fish in the dish |
16 | amside | 1 | g /serving | 20 - 70 | 1/6 plate or below (50 grams) |
17 | amside | 2 | g /serving | 70 - 130 | 1/4 plate (100 grams) |
18 | amside | 3 | g /serving | 120 - 180 | 1/2 plate (150 grams) |
19 | amside | 4 | g /serving | 170 - 230 | 2/3 plate (200 grams) |
20 | amside | 5 | g /serving | 220 - 280 | 5/6 plate (250 grams) |
21 | change | 1 | fraction | -1 | Decrease it to zero |
22 | change | 2 | fraction | -0.9 - -0.6 | Decrease it to less than half |
23 | change | 3 | fraction | -0.1 - -0.4 | Decrease it a bit |
24 | change | 4 | fraction | 0 | No effect |
25 | change | 5 | fraction | 0.1 - 0.4 | Increase it a bit |
26 | change | 6 | fraction | 0.6 - 0.9 | Increase it over by half |
27 | change | 7 | fraction | 1.1 - 1.3 | Increase it over to double |
28 | change | 8 | fraction | 0 | Don't know |
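The table above maps each answer code to a number or a range. A minimal sketch of how one Result cell might be turned into numbers, assuming (this is not stated on the page) that a range such as "12 - 36" is read as a uniform distribution between its endpoints and that a single number is a constant. The helper `sample_range` is hypothetical and only illustrates the idea.

```r
# Hypothetical helper: draw n values from one Result cell of the assumptions table.
# Assumption: "a - b" means uniform between a and b; a single number is a constant.
sample_range <- function(result, n = 1) {
  # Split on a hyphen surrounded by spaces so that negative endpoints
  # such as "-0.9 - -0.6" are kept intact.
  ends <- as.numeric(strsplit(result, " - ", fixed = TRUE)[[1]])
  if (length(ends) == 1) return(rep(ends, n))
  runif(n, min(ends), max(ends))
}

set.seed(1)
freq4   <- sample_range("12 - 36", 1000)      # freq = 4: 1 - 3 times per month
change2 <- sample_range("-0.9 - -0.6", 1000)  # change = 2: decrease to less than half
never   <- sample_range("0", 3)               # freq = 1: never

range(freq4)
range(change2)
```

Using `min()` and `max()` on the endpoints means that a row written in the opposite order (such as "-0.1 - -0.4") is handled the same way.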
Preprocessing
This code preprocesses the original questionnaire data from the .csv file above and stores it as usable variables in the Opasnet base. The code stores the data.frames survey and surv and the helper vectors agel, countryl, genderl and fisl.
- Model run 13.4.2017 [2]
- Model run 20.4.2017 [3] (contains surv and helping vectors)
- Model run 21.4.2017 [4] surv contains Eatfish, Eatherr and Eatsalm as columns.
# This code is Op_en7749/preprocess2 on page [[Goherr: Fish consumption study]]

library(OpasnetUtils)

objects.latest("Op_en6007", code_name = "answer") # [[OpasnetUtils/Drafts]] webropol.convert, merge.questions

############# Data preprocessing

# Get the data either from Opasnet or your own hard drive.

#Survey original file: N:/Ymal/Projects/Goherr/WP5/Goherr_fish_consumption.csv

survey <- opasnet.csv(
  "5/57/Goherr_fish_consumption.csv",
  wiki = "opasnet_en", sep = ";", fill = TRUE, quote = "\""
)

#survey <- read.csv(file = "N:/Ymal/Projects/Goherr/WP5/Goherr_fish_consumption.csv",
#   header=FALSE, sep=";", fill = TRUE, quote="\"")

# Data file is converted to data.frame using levels at row 2121.
survey <- webropol.convert(survey, 2121, textmark = ":Other open")

# Delete rows that are clearly overestimations
survey <- survey[-c(636, 1465, 1865, 1876, 2062, 2088), ]

# Take the relevant column names from the table on the page.
colnames(survey) <- gsub(" ", ".",
  opbase.data("Op_en7749", subset = "Questions in the Goherr questionnaire")$Result[1:ncol(survey)]
)

survey$Row <- 1:nrow(survey)
survey$Weighting <- as.double(gsub(",",".", survey$Weighting))
survey$Ages <- factor(
  ifelse(as.numeric(as.character(survey$Age)) < 46, "18-45",">45"),
  levels = c("18-45", ">45"), ordered = TRUE
)

# webropol.convert should put these in the right order but doesn't. So do it manually.
freqlist <- c(
  "less than once a year",
  "A few times a year",
  "1 - 3 times per month",
  "once a week",
  "2 - 4 times per week",
  "5 or more times per week"
)
amlist <- c(
  "1/6 plate or below (50 grams)",
  "1/3 plate (100 grams)",
  "1/2 plate (150 grams)",
  "2/3 plate (200 grams)",
  "5/6 plate (250 grams)",
  "full plate (300 grams)",
  "overly full plate (500 grams)"
  # "Not able to estimate"
)
sidel <- c(
  "1/6 plate or below (50 grams)",
  "1/4 plate (100 grams)",
  "1/2 plate (150 grams)",
  "2/3 plate (200 grams)",
  "5/6 plate (250 grams)"
  # "Not able to estimate"
)

fishamounts <- c(29,46:49,95:98)
colnames(survey)[fishamounts]
#[1] "How.often.fish"    "How.often.BS"      "How.much.BS"       "How.often.side.BS"
#[5] "How.much.side.BS"  "How.often.BH"      "How.much.BH"       "How.often.side.BH"
#[9] "How.much.side.BH"

# Answer options for each column in fishamounts, in the same order
ansl <- list(
  freqlist,
  c("Never", freqlist),
  amlist,
  c("Never", freqlist),
  sidel,
  c("Never", freqlist),
  amlist,
  c("Never", freqlist),
  sidel
)

for (i in 1:length(fishamounts)) {
  survey[[fishamounts[i]]] <- factor(survey[[fishamounts[i]]], levels = ansl[[i]], ordered = TRUE)
}

oprint(head(survey))

agel <- as.character(unique(survey$Ages))
countryl <- sort(as.character(unique(survey$Country)))
genderl <- sort(as.character(unique(survey$Gender)))
fisl <- c("Salmon", "Herring")

# Interesting fish eating questions
surv <- survey[c(1,3,158,16,29,30,31,46:49,75,80,86,95:98, 125, 130)]
colnames(surv)
#[1] "Country"                       "Gender"
#[3] "Ages"                          "Fish eating"
#[5] "How often eat fish"            "Salmon eating"
#[7] "Baltic salmon"                 "How often Baltic salmon"
#[9] "How much Baltic salmon"        "How often side Baltic salmon"
#[11] "How much side Baltic salmon"  "Better availability BS"
#[13] "Less chemicals BS"            "Eat Baltic herring"
#[15] "How often Baltic herring"     "How much Baltic herring"
#[17] "How often side Baltic herring" "How much side Baltic herring"
#[19] "Better availability BH"       "Less chemicals BH"

# For estimating distributions, we should
#1 remove people with Fish eating = No (142)
#2 merge Eat Baltic herring = I don't know with No (How often BH = NA always)
#3 merge Baltic salmon = NA with No (because they usually have answered BH questions)

oprint(table(is.na(rowSums(sapply(surv[4:20], as.numeric)))))
# BUT: there are so many missing values, that we just model BH and BS separately now.

surv <- as.data.frame(lapply(surv, FUN = function(x) as.integer(x))) # Coerce to integers

surv[is.na(surv[[14]]) | surv[[14]] == 3 , 14] <- 1 # Eat Baltic herring: I don't know --> No

# Row numbers for respondents that have eaten fish, Baltic salmon, and Baltic herring
surv$Eatfish <- surv[[4]] %in% 2
surv$Eatsalm <- surv[[7]] %in% 2 & !is.na(rowSums(surv[7:11]))
surv$Eatherr <- surv[[14]] %in% 2 & !is.na(rowSums(surv[15:18]))

oprint(table(surv[c("Eatsalm", "Eatherr", "Eatfish")], useNA = "ifany"))

# Assume that the covariance matrix is constant across all countries, genders etc.,
# but the mean is specific to these and to the question.

#qlen <- c(4,2,2,2,6,2,2,7,7,7,5,2,7,7,7,5) # Number of options in each question of surv
# qlen not needed when dbinom is not used.

agel
countryl
genderl
fisl

objects.store(survey, surv, agel, countryl, genderl, fisl)
cat("Data.frames survey and surv, and vectors agel, countryl, genderl and fisl were stored.\n")
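The preprocessing relies on ordered factors: the text answers are given explicit level orders, and the later integer coercion turns each answer into its position in that order. A small sketch with made-up answers (the vector `answers` is illustrative, not survey data) shows the mechanism and how the codes line up with the freq rows of the assumptions table.

```r
# Answer options in questionnaire order, as in freqlist of the preprocessing code.
freqlist <- c(
  "less than once a year", "A few times a year", "1 - 3 times per month",
  "once a week", "2 - 4 times per week", "5 or more times per week"
)

# Made-up answers for illustration; "" is not a valid level and becomes NA.
answers <- c("once a week", "Never", "A few times a year", "")

f <- factor(answers, levels = c("Never", freqlist), ordered = TRUE)
codes <- as.integer(f)  # position of each answer in the level order
codes
```

With "Never" as the first level, "once a week" gets code 5, which matches the row freq = 5 (52 times /a) in the assumptions table. Any answer that is not among the levels silently becomes NA, so a misspelled level would show up as missing data rather than as an error.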
Analyses
Descriptive statistics

The model must contain predictors such as country, gender and age. It may be useful to first study which determinants are important. The model must also contain determinants that would increase or decrease fish consumption, conditional on the current consumption; how to do this is still open. A principal coordinates analysis with all questions could show how they behave.
The correlation table should also be examined for clusters.
Some obvious results:
- If a respondent reports no fish eating, many subsequent answers are NA.
- No vitamins correlates negatively with vitamin intake.
- Unknown salmon correlates negatively with the types of salmon eaten.
- Different age categories correlate with each other.
However, there are also meaningful negative correlations:
- Country vs allergy
- Country vs Norwegian salmon and Rainbow trout
- Country vs not traditional.
- Country vs recommendation awareness
- Allergy vs economic wellbeing
- Baltic salmon use (4 questions) vs Don't like taste and Not used to
- All questions between Easy to cook ... Traditional dish
Meaningful positive correlations:
- All questions between Baltic salmon ... Rainbow trout
- How often Baltic salmon/herring/side salmon/side herring
- How much Baltic salmon/herring/side salmon/side herring
- Better availability ... Recommendation
- All questions between Economic wellbeing...Personal aims
- Omega3, Vitamin D, and Other vitamins
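The correlations listed above come from a Spearman rank correlation matrix computed with pairwise complete observations, which suits ordinal answer codes with many missing values. A minimal sketch on made-up data (the data.frame `toy` and its column `Dont.like` are illustrative only):

```r
# Toy ordinal data (made up): integer answer codes with missing values.
toy <- data.frame(
  How.often.BH = c(1, 2, 3, 4, 5, NA, 2, 6),
  How.much.BH  = c(1, 2, 2, 4, 5, 3, NA, 6),
  Dont.like    = c(6, 5, 5, 2, 1, 3, 4, NA)
)

# Each pair of columns uses all rows where both values are present.
rho <- cor(toy, method = "spearman", use = "pairwise.complete.obs")
round(rho, 2)
```

Because each pair of questions is computed from a different subset of respondents, the resulting matrix is not guaranteed to be a valid (positive semi-definite) correlation matrix; it is best read as a descriptive overview.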
Model runs
- Model run 13.3.2017
- Model run 21.4.2017 [5] old code from Answer merged to this code and debugged
# This is code Op_en7749/ on page [[Goherr: Fish consumption study]]

library(OpasnetUtils)
library(ggplot2)
library(reshape2)
library(car)
library(vegan)

objects.latest("Op_en7749", "preprocess2") # [[Goherr: Fish consumption study]]: survey, surv

############################### From a previous code on Answer

# Bar plots of selected questions by country
for(i in c(5:6, 16, 29:30, 46:49, 85:86, 95:98, 135)) {
  temp <- survey[!is.na(survey[[i]]),]
  p <- ggplot(temp, aes(x = temp[[i]])) + geom_bar() + theme_gray(base_size = 18) +
    theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4)) +
    labs(title = colnames(temp[i])) + xlab("") + facet_wrap(~ Country)
  print(p)
}

temp <- melt(survey, measure.vars = 145:153, variable.name = "Changing.factor", value.name = "Impact")
levs <- c("-5 strongly disagree", "-4", "-3", "-2", "-1", "0 Neutral", "1", "2", "3", "4", "5 strongly agree", "I don't know")
temp$Impact <- factor(temp$Impact, levels = levs, labels = levs, ordered = TRUE)

ggplot(temp, aes(x = Impact, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Changing.factor)+ theme_gray(base_size = 24) +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))

# Respondents who eat Baltic salmon
surveytemp <- subset(survey, survey[[31]] == "Yes")

temp <- melt(surveytemp, measure.vars = 38:43, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  labs(title = "Baltic salmon sources")

temp <- melt(surveytemp, measure.vars = 50:59, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  labs(title = "Reasons to eat Baltic salmon")

temp <- melt(surveytemp, measure.vars = 73:82, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4)) +
  labs(title = "Baltic salmon actions")

# Respondents who do not eat Baltic salmon
surveytemp <- subset(survey, survey[[31]] == "No")

temp <- melt(surveytemp, measure.vars = 62:70, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  labs(title = "Reasons not to eat Baltic salmon")

# Respondents who eat Baltic herring
surveytemp <- subset(survey, survey[[86]] == "Yes")

temp <- melt(surveytemp, measure.vars = 87:92, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  labs(title = "Baltic herring sources")

temp <- melt(surveytemp, measure.vars = 99:108, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  labs(title = "Reasons to eat Baltic herring")

temp <- melt(surveytemp, measure.vars = 123:132, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4)) +
  labs(title = "Baltic herring actions")

# Respondents who do not eat Baltic herring
surveytemp <- subset(survey, survey[[86]] == "No")

temp <- melt(surveytemp, measure.vars = 111:120, variable.name = "Variable", value.name = "Value")
ggplot(temp, aes(x = Value, fill = Country)) + geom_bar(position = "dodge") +
  facet_wrap(~ Variable)+ theme_gray(base_size = 24) +
  labs(title = "Reasons not to eat Baltic herring")

#####################################

temp <- sapply(survey, as.numeric) # Can be done for surv to get a smaller matrix
survey_correlations <- (cor(temp, method="spearman", use="pairwise.complete.obs"))

temp <- colnames(survey_correlations)
melted_correlations <- melt(survey_correlations)
melted_correlations$Var1 <- factor(melted_correlations$Var1, levels=temp)
melted_correlations$Var2 <- factor(melted_correlations$Var2, levels=temp)
melted_correlations$value <- ifelse(melted_correlations$value >= 0.99,NA,melted_correlations$value)

ggplot(melted_correlations, aes(x = Var1, y = Var2, fill = value, label= round(value, 2)))+
  geom_raster()+
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))+
  scale_fill_gradient2(low = "#480610", mid = "#FFFFFF", high = "#06480F",
    midpoint = 0, space = "Lab", guide = "colourbar")

####################### Descriptive statistics

# NOTE: some column indices of surv used below do not match the 20-column listing
# in preprocess2 (there the Baltic herring questions are columns 15:18).
# Check the indices before relying on these outputs.

oprint(cor(surv, use = "pairwise.complete.obs"))
# --> Baltic salmon and herring eating are correlated, so they should be estimated together

oprint(table(surv[c(12,7,4)], useNA = "ifany"))
oprint(table(surv[c(13,12)], useNA = "ifany"))

############################# Plot original data

# Eating frequencies of fish and Baltic salmon and herring with random noise, all
pl <- surv[surv$Eatfish,c(5,8,10,13,15)]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

ggplot(data.frame(
  X = rep(1,5),
  Y = 5:1,
  legend = c("All", "Finland", "Sweden", "Denmark", "Estonia")
), aes(x=X, y=Y, label=legend))+
  geom_text()+
  labs(title="Fish eating questions with random noise")

## Fish eating questions with some random noise, all
pl <- surv[surv$Eatfish,c(5,6,7,12)]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

## Fish eating questions with some random noise, FI
pl <- surv[surv$Country == 1 & surv$Eatfish,c(5,6,7,12)]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

## Fish eating questions with some random noise, SWE
pl <- surv[surv$Country == 2 & surv$Eatfish,c(5,6,7,12)]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

## Fish eating questions with some random noise, Dk
pl <- surv[surv$Country == 3 & surv$Eatfish,c(5,6,7,12)]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

## Fish eating questions with some random noise, EST
pl <- surv[surv$Country == 4 & surv$Eatfish,c(5,6,7,12)]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

## Baltic herring questions with some random noise
pl <- surv[surv$Eatherr,13:16]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

## Baltic salmon questions with some random noise
pl <- surv[surv$Eatsalm,8:11]
scatterplotMatrix(
  data.matrix(pl) + runif(nrow(pl)*ncol(pl), -0.5, 0.5)
)

##################### CORRELATION MATRIX

temp <- sapply(survey, as.numeric) # Can be done for surv to get a smaller matrix
survey_correlations <- (cor(temp, method="spearman", use="pairwise.complete.obs"))

temp <- colnames(survey_correlations)
melted_correlations <- melt(survey_correlations)
melted_correlations$Var1 <- factor(melted_correlations$Var1, levels=temp)
melted_correlations$Var2 <- factor(melted_correlations$Var2, levels=temp)
melted_correlations$value <- ifelse(melted_correlations$value >= 0.99,NA,melted_correlations$value)

ggplot(melted_correlations, aes(x = Var1, y = Var2, fill = value, label= round(value, 2)))+
  geom_raster()+
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))+
  scale_fill_gradient2(low = "#480610", mid = "#FFFFFF", high = "#06480F",
    midpoint = 0, space = "Lab", guide = "colourbar")

############################### PRINCIPAL COORDINATE ANALYSIS (PCoA)

# This part prepares the data.
hypocols1 <- c(46:49,95:98)
answ <- sapply(survey[hypocols1], FUN=as.numeric)
answ <- as.matrix(answ[!is.na(rowSums(answ)),])

pcoa_caps <- capscale(t(answ) ~ 1, distance="euclidean") ##PCoA done

## Plot of all hypotheses
colstr <- c("palevioletred1","royalblue1","seagreen1","violet","khaki2","skyblue", "orange")
#hypo_sizes <- (5 - colMeans(answ))
#leg_sizes <- c(4, 3, 2, 1, 0.01)

#pdf(file="pcoa_plot.pdf", height=6, width=7.5)
plot(pcoa_caps, display = c("sp", "wa"), type="n")#, xlim=c(-6,4.5)) ## PCoA biplot, full scale
points(pcoa_caps, display= c("sp"), col="gray40") # adding the people points
points(pcoa_caps, display= c("wa"), pch=19)#, cex=hypo_sizes, col=trait.cols)
text(pcoa_caps, display=c("wa"), srt=25, cex=0.5)
#legend(x=-6, y=3.8, levels(traits), fill=colstr, bty="n", cex=1)
#legend(-6, -2, legend=c("Very likely", "Moderately likely",
#  "No opinion", "Moderately unlikely", "Very unlikely"),
#  pch=21, pt.cex = leg_sizes, bty="n", cex=1)
#dev.off()
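The principal coordinate analysis above places the questions (not the respondents) in a low-dimensional space, so that questions answered similarly end up close together. A minimal sketch of the same idea on made-up data, using `cmdscale` from base R instead of `vegan::capscale`; this is a swapped-in technique for illustration, and the matrix `answ` here is simulated, not survey data.

```r
set.seed(2)

# Made-up answer matrix: 50 respondents (rows) x 4 questions (columns), codes 1-7.
answ <- matrix(
  sample(1:7, 200, replace = TRUE), ncol = 4,
  dimnames = list(NULL, c("often", "much", "oftenside", "muchside"))
)
# Make two questions related so that they end up close to each other.
answ[, "oftenside"] <- pmin(7, answ[, "often"] + sample(0:1, 50, replace = TRUE))

d <- dist(t(answ), method = "euclidean")  # distances between questions
pcoa <- cmdscale(d, k = 2, eig = TRUE)    # classical scaling, i.e. PCoA

pcoa$points  # coordinates of the four questions on the first two axes
```

With Euclidean distances, classical scaling gives the same configuration as a principal component analysis of the transposed matrix, so the choice of distance is what would make a PCoA differ from a PCA here.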
Bayes model
- Model run 3.3.2017. All variables assumed independent. [6]
- Model run 3.3.2017. p has more dimensions. [7]
- Model run 25.3.2017. Several model versions: strange binomial+multivarnormal, binomial, fractalised multivarnormal [8]
- Model run 27.3.2017 [9]
- All models other than the multivariate normal were archived and removed from the active code on 29.3.2017.
- Model run 29.3.2017 with raw data graphs [10]
- Model run 29.3.2017 with salmon and herring ovariables stored [11]
- Model run 13.4.2017 with first version of coordinate matrix and principal coordinate analysis [12]
- Model run 20.4.2017 [13] code works but needs a safety check against outliers
- Model run 21.4.2017 [14] some model results plotted
- Model run 21.4.2017 [15] ovariables produced by the model stored.
- Model run 18.5.2017 [16] small updates
# This is code Op_en7749/bayes on page [[Goherr: Fish consumption study]]

library(OpasnetUtils)
library(ggplot2)
library(reshape2)
library(rjags)
library(car)
library(vegan)
library(MASS)
#library(gridExtra) # Error: package ‘gridExtra’ was built before R 3.0.0: please re-install it

# Fish intake in humans
# Data from data.frame survey from page [[Goherr: Fish consumption study]]

objects.latest("Op_en7749", "preprocess2") # [[Goherr: Fish consumption study]]: survey, surv, ...

cat("Version with multivariate normal.\n")

# Development needs:
## Correlation between salmon.often and herring.often needs to be estimated.
## Gender, country and age-specific values should be estimated.

mod <- textConnection("
  model{
    for(i in 1:S) {
      survs[i,1:4] ~ dmnorm(mus[], Omegas[,])
    }
    for(j in 1:H) {
      survh[j,1:4] ~ dmnorm(muh[], Omegah[,])
    }
    mus[1:4] ~ dmnorm(mu0[1:4], S2[1:4,1:4])
    Omegas[1:4,1:4] ~ dwish(S3[1:4,1:4],S)
    anss.pred ~ dmnorm(mus[], Omegas[,])
    muh[1:4] ~ dmnorm(mu0[1:4], S2[1:4,1:4])
    Omegah[1:4,1:4] ~ dwish(S3[1:4,1:4],H)
    ansh.pred ~ dmnorm(muh[], Omegah[,])
  }
")

# NOTE: in the column listing of surv in preprocess2, the four Baltic herring
# questions are columns 15:18, not 13:16. Check the indices for survh.
jags <- jags.model(
  mod,
  data = list(
    survs = surv[surv$Eatsalm,c(8:11)],
    S = sum(surv$Eatsalm),
    survh = surv[surv$Eatherr,c(13:16)],
    H = sum(surv$Eatherr),
    mu0 = rep(2,4),
    S2 = diag(4)/100000,
    S3 = diag(4)/10000
  ),
  n.chains = 4,
  n.adapt = 100
)

update(jags, 100)

samps.j <- jags.samples(
  jags,
  c('mus', 'Omegas', 'anss.pred','muh','Omegah','ansh.pred'),
  1000
)

# Collect the first chain into an array: Question x Iter x Parameter x Fish
js <- array(
  c(
    samps.j$mus[,,1],
    samps.j$Omegas[,1,,1],
    samps.j$Omegas[,2,,1],
    samps.j$Omegas[,3,,1],
    samps.j$Omegas[,4,,1],
    samps.j$anss.pred[,,1],
    samps.j$muh[,,1],
    samps.j$Omegah[,1,,1],
    samps.j$Omegah[,2,,1],
    samps.j$Omegah[,3,,1],
    samps.j$Omegah[,4,,1],
    samps.j$ansh.pred[,,1]
  ),
  dim = c(4,1000,6,2),
  dimnames = list(
    Question = 1:4,
    Iter = 1:1000,
    Parameter = c("mu","Omega1", "Omega2", "Omega3", "Omega4", "ans.pred"),
    Fish = c("Salmon", "Herring")
  )
)

# Mu for all questions about salmon
scatterplotMatrix(t(js[,,1,1]))
# All parameters for question 1 about salmon
scatterplotMatrix(js[1,,,1])
# Same for herring
scatterplotMatrix(js[1,,,2])

jsd <- melt(js)
#ggplot(jsd, aes(x=value, colour=Question))+geom_density()+facet_grid(Parameter ~ Fish)

#ggplot(as.data.frame(js), aes(x = anss.pred, y = Sampled))+geom_point()+stat_ellipse()

coda.j <- coda.samples(
  jags,
  c('mus', 'Omegas', 'anss.pred'),
  1000
)

plot(coda.j)

######## fish.param contains expected values of the distribution parameters from the model

fish.param <- list(
  mu = apply(js[,,1,], MARGIN = c(1,3), FUN = mean),
  Omega = lapply(
    1:2,
    FUN = function(x) {
      solve(apply(js[,,2:5,], MARGIN = c(1,3,4), FUN = mean)[,,x])
    } # solve matrix: precision->covariance
  )
)

objects.store(fish.param)
cat("List fish.param stored.\n")
Initiate ovariables
Amounts estimated from a Bayesian model.
- Model run 24.5.2017 [17]
# This is code Op_en7749/modeljsp on page [[Goherr: Fish consumption study]]

library(OpasnetUtils)

jsp <- Ovariable(
  "jsp",
  dependencies = data.frame(Name = "fish.param", Ident = "Op_en7749/bayes"),
  formula = function(...) {
    require(MASS)
    require(reshape2)

    # Sample N answer vectors for each fish (1 = salmon, 2 = herring)
    jsp <- lapply(
      1:2,
      FUN = function(x) {
        mvrnorm(openv$N, fish.param$mu[,x], fish.param$Omega[[x]])
      }
    )

    jsp <- rbind(
      cbind(
        Fish = "Salmon",
        Iter = 1:nrow(jsp[[1]]),
        as.data.frame(jsp[[1]])
      ),
      cbind(
        Fish = "Herring",
        Iter = 1:nrow(jsp[[2]]),
        as.data.frame(jsp[[2]])
      )
    )
    jsp <- melt(jsp, id.vars = c("Iter", "Fish"), variable.name = "Question", value.name = "Result")
    jsp <- Ovariable(output=jsp, marginal = colnames(jsp) %in% c("Iter", "Fish", "Question"))
    return(jsp)
  }
)

objects.store(jsp)
cat("Ovariable jsp stored.\n")
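The jsp formula above draws answer vectors from a multivariate normal distribution whose covariance is the inverse of the precision matrix estimated by the model. A minimal sketch of these two steps, written with base R (a Cholesky factor) instead of `MASS::mvrnorm`; the values of `mu` and `Omega` are hypothetical and chosen only for illustration, they are not results of the model.

```r
set.seed(3)

# Hypothetical parameters for illustration only (not results of the Bayes model).
mu <- c(3, 3, 2, 2)                 # means of the four answers
Omega <- diag(4) * 2                # precision matrix
Omega[1, 2] <- Omega[2, 1] <- -0.5
Sigma <- solve(Omega)               # precision -> covariance

N <- 1000
z <- matrix(rnorm(N * 4), ncol = 4)        # independent standard normals
x <- sweep(z %*% chol(Sigma), 2, mu, "+")  # N draws from MVN(mu, Sigma)

answers <- round(x)  # back to integer answer codes, as done in the initiate code
table(answers[, 1])
```

Rounded draws can fall outside the valid answer codes (for example below 1), in which case they have no matching row in the assumptions table. This is in line with the note on model run 20.4.2017 that the code needs a safety check against outliers.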
Amount estimates directly from data rather than from a Bayesian model.
- Initiation run 18.5.2017 [18]
# This is code Op_en7749/surveyjsp on page [[Goherr: Fish consumption study]]

# The code produces amount estimates (jsp ovariable) directly from data rather than from a Bayesian model.

library(OpasnetUtils)
library(reshape2)

objects.latest("Op_en7749", code_name="preprocess2") # original survey data (surv)

sur <- survey[c(157,1,3,158,16,29,30,31,46:49,86,95:98)]

#colnames(sur)
#[1] "Row"               "Country"           "Gender"            "Ages"
#[5] "Eat.fish"          "How.often.fish"    "Eat.salmon"        "Baltic.salmon"
#[9] "How.often.BS"      "How.much.BS"       "How.often.side.BS" "How.much.side.BS"
#[13] "Eat.BH"           "How.often.BH"      "How.much.BH"       "How.often.side.BH"
#[17] "How.much.side.BH"

# Rename the question columns to question numbers (5, 1:4) for both fish
colnames(sur)[c(1,8:17)] <- c("Iter",rep(as.character(c(5,1:4)),2))
sur[8:17] <- sapply(sur[8:17], as.numeric)

# Stack herring and salmon answers on top of each other
sur <- rbind(
  cbind(Fish="Herring", sur[-(8:12)]),
  cbind(Fish="Salmon", sur[-(13:17)])
)

sur <- melt(
  sur,
  measure.vars=as.character(1:5),
  variable.name="Question",
  value.name="Result"
)
sur$Result[is.na(sur$Result)] <- 1 # Ovariable often becomes never -> amount becomes 0.

jsp <- Ovariable(
  "jsp",
  output = sur,
  marginal = colnames(sur) %in% c("Fish", "Iter", "Question")
)

objects.store(jsp)
cat("Ovariable jsp with actual survey data: each respondent is an iteration.\n")
Initiate other ovariables
- Code stores ovariables assump, often, much, oftenside, muchside, amount.
- Model run 19.5.2017 [19]
- Initiation run 24.5.2017 without jsp [20]
- Model run 8.6.2017 [21]
# This is code Op_en7749/initiate on page [[Goherr: Fish consumption study]]

library(OpasnetUtils)

# Combine modelled survey answers with estimated amounts and frequencies by:
# Rounding the modelled result and merging that with value in ovariable assump

often <- Ovariable(
  "often",
  dependencies = data.frame(Name=c("jsp","assump")),
  formula = function(...) {
    out <- jsp[jsp$Question == "1" , !colnames(jsp@output) %in% c("Question")]
    out$Value <- round(result(out))
    out <- merge(
      assump@output[assump$Variable == "freq",],
      out@output
    )
    out <- out[!colnames(out) %in% c("Value", "Variable", "Result")]
    colnames(out)[colnames(out) == "assumpResult"] <- "Result"
    return(out)
  }
)

much <- Ovariable(
  "much",
  dependencies = data.frame(Name=c("jsp","assump")),
  formula = function(...) {
    out <- jsp[jsp$Question == "2" , !colnames(jsp@output) %in% c("Question")]
    out$Value <- round(result(out))
    out <- merge(
      assump@output[assump$Variable == "amdish",],
      out@output
    )
    out <- out[!colnames(out) %in% c("Value", "Variable", "Result")]
    colnames(out)[colnames(out) == "assumpResult"] <- "Result"
    return(out)
  }
)

oftenside <- Ovariable(
  "oftenside",
  dependencies = data.frame(Name=c("jsp","assump")),
  formula = function(...) {
    out <- jsp[jsp$Question == "3" , !colnames(jsp@output) %in% c("Question")]
    out$Value <- round(result(out))
    out <- merge(
      assump@output[assump$Variable == "freq",],
      out@output
    )
    out <- out[!colnames(out) %in% c("Value", "Variable", "Result")]
    colnames(out)[colnames(out) == "assumpResult"] <- "Result"
    return(out)
  }
)

muchside <- Ovariable(
  "muchside",
  dependencies = data.frame(Name=c("jsp","assump")),
  formula = function(...) {
    out <- jsp[jsp$Question == "4" , !colnames(jsp@output) %in% c("Question")]
    out$Value <- round(result(out))
    out <- merge(
      assump@output[assump$Variable == "amside",],
      out@output
    )
    out <- out[!colnames(out) %in% c("Value", "Variable", "Result")]
    colnames(out)[colnames(out) == "assumpResult"] <- "Result"
    return(out)
  }
)

assump <- Ovariable(
  "assump",
  ddata = "Op_en7749",
  subset = "Assumptions for calculations"
)

amount <- Ovariable(
  "amount",
  dependencies = data.frame(Name = c(
    "often",
    "much",
    "oftenside",
    "muchside",
    "assump"
  )),
  formula = function(...) {
    # Columns that are not needed as indices in the calculation
    away <- c(
      "assumpUnit",
      "Eat.fish",
      "How.often.fish",
      "Eat.salmon"
    )
    often <- often[ , !colnames(often@output) %in% away]
    much <- much[ , !colnames(much@output) %in% away]
    oftenside <- oftenside[ , !colnames(oftenside@output) %in% away]
    muchside <- muchside[ , !colnames(muchside@output) %in% away]
    assump <- assump[assump$Variable == "ingredient",
      !colnames(assump@output) %in% c("Variable", "Value", "Explanation", "assumpUnit")]

    out <- (often * much + oftenside * muchside * assump)/365 # g /d

    return(out)
  }
)

objects.store(assump, often, much, oftenside, muchside, amount)
cat("Ovariables assump, often, much, oftenside, muchside, amount stored.\n")
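The amount ovariable above combines yearly frequencies and serving sizes into a daily intake: (often * much + oftenside * muchside * ingredient) / 365. A worked sketch with plain numbers follows; the function `daily_amount` is a hypothetical stand-in for the ovariable arithmetic, and the input values are picked from within the ranges of the assumptions table only for illustration. The "side" questions are interpreted here as dishes where fish is only one ingredient.

```r
# Daily intake (g /d) from yearly frequencies (times /a) and serving sizes (g /serving).
# 'ingredient' is the fraction of fish in the dish for the "side" questions.
daily_amount <- function(often, much, oftenside, muchside, ingredient) {
  (often * much + oftenside * muchside * ingredient) / 365
}

# Illustrative values: once a week, 1/2 plate (150 g) as a main dish;
# once a month, 100 g of a dish in which 20 % is fish.
a <- daily_amount(often = 52, much = 150, oftenside = 12, muchside = 100, ingredient = 0.2)
a  # (7800 + 240) / 365, about 22 g /d
```

In this example the main-dish term (7800 g /a) dominates the side-dish term (240 g /a), which illustrates why the ingredient fraction has a limited effect unless side dishes are eaten much more often than main dishes.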
Dependencies
The survey data will be used as input in the benefit-risk assessment of Baltic herring and salmon intake, which is part of the WP5 work in the Goherr project.
See also
- Useful information about Wishart distribution and related topics: