OpasnetBaseUtils: Difference between revisions
{{tool|moderator=Teemu R|stub=Yes}}
[[Category:Code under inspection]]
{{attack|# |The code on this page should be built into the OpasnetUtils package and then the page should be merged with [[OpasnetUtils]]. |--[[User:Jouni|Jouni]] 18:45, 15 May 2013 (EEST)}}
==Question==
OpasnetBaseUtils is a collection of [[R]] functions, bundled into a package, for interacting with the [[Opasnet Base]] and for manipulating data on multiple variables whose dimensions may or may not match. What should such a package contain?
==Answer==
OpasnetBaseUtils contains the following functions. They are described in detail on the linked pages.
* [[Opasnet Base Connection for R#Downloading data|op_baseGetData()]]
* [[Opasnet Base Connection for R#Finding index data|op_baseGetLocs()]]
* [[Opasnet Base Connection for R#Uploading data|op_baseWrite()]]
* The following functions are outdated and are kept only for compatibility with old code.
** [http://en.opasnet.org/en-opwiki/index.php?title=Operating_intelligently_with_multidimensional_arrays_in_R&oldid=18412 IntArray()] (and related [http://en.opasnet.org/en-opwiki/index.php?title=Talk:Operating_intelligently_with_multidimensional_arrays_in_R&oldid=17403 discussion]). This function has been replaced by merge().
** [http://en.opasnet.org/en-opwiki/index.php?title=Opasnet_Base_Connection_for_R&oldid=22176#Manipulating_data DataframeToArray()]. This function was needed when most calculations were done on arrays. Calculations are now done directly on data.frames, which are rarely converted into arrays. It is more common to convert arrays into data.frames using as.data.frame(as.table(array)).
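The two idioms mentioned above, as.data.frame(as.table(array)) and merge(), can be illustrated with a minimal sketch. The data, the dimension names (Sex, Age) and the Weight column are hypothetical and only serve the example; they do not come from the Opasnet Base.

```r
# Hypothetical example data: a 2 x 3 array with named dimensions.
arr <- array(1:6, dim = c(2, 3),
             dimnames = list(Sex = c("F", "M"),
                             Age = c("0-17", "18-64", "65+")))

# Array -> long data.frame; the cell values end up in a column called "Freq".
long <- as.data.frame(as.table(arr))
colnames(long)[colnames(long) == "Freq"] <- "Result"

# A second (hypothetical) variable that shares only the Age dimension.
weight <- data.frame(Age = c("0-17", "18-64", "65+"),
                     Weight = c(0.2, 0.6, 0.2))

# merge() joins on the shared column (Age) and repeats the weights
# over the dimension that is not shared (Sex).
combined <- merge(long, weight)
combined$Weighted <- combined$Result * combined$Weight
```

Because merge() matches on the common column names, variables with partly matching dimensions can be combined without first aligning them as arrays.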
===Rcode generic===
* Functions: dropall, PTable, opasnet.data, tidy, summary.bring
<rcode name="generic">
######################################
## dropall drops all unused factor levels from a data.frame.
## Parameters: x = data.frame
dropall <- function(x){
  # Find the factor columns and re-level only those.
  isFac <- vapply(x, is.factor, logical(1))
  for (i in which(isFac)){
    x[[i]] <- x[[i]][drop = TRUE]
  }
  return(x)
}
#########################################
## PTable converts the probability table of an assessment into a form suitable for Monte Carlo simulation.
## Parameters: P = probability table fetched from the Opasnet Base.
## n = number of Monte Carlo iterations
## The probability table must have the columns Muuttuja (variable), Selite (index), Lokaatio (location), P
## (the code below reads the probabilities from the Result column).
## The output is a table for Monte Carlo whose columns are obs (iteration) and every index (Selite) found in the
## probability table; on each row a location is drawn according to the probabilities given in the table.
PTable <- function(P, n) {
  Pt <- unique(P[, c("Muuttuja", "Selite")])
  # One uniform random number per Muuttuja/Selite pair and iteration.
  Pt <- data.frame(Muuttuja = rep(Pt$Muuttuja, n), Selite = rep(Pt$Selite, n), obs = rep(1:n, each = nrow(Pt)), P = runif(n*nrow(Pt), 0, 1))
  # Turn the probabilities into cumulative probabilities within each Muuttuja/Selite pair (rows of a pair must be adjacent).
  for(i in seq_len(nrow(P))[-1]){P$Result[i] <- P$Result[i] + ifelse(P$Muuttuja[i] == P$Muuttuja[i-1] & P$Selite[i] == P$Selite[i-1], P$Result[i-1], 0)}
  P <- merge(P, Pt)
  # Keep the locations whose cumulative probability reaches the random number...
  P <- P[P$P <= P$Result, ]
  # ...and pick the smallest of them for each pair and iteration.
  Pt <- as.data.frame(as.table(tapply(P$Result, as.list(P[, c("Muuttuja", "Selite", "obs")]), min)))
  colnames(Pt) <- c("Muuttuja", "Selite", "obs", "Result")
  Pt <- Pt[!is.na(Pt$Result), ]
  P <- merge(P, Pt)
  P <- P[, !colnames(P) %in% c("Result", "P", "Muuttuja")]
  P <- reshape(P, idvar = "obs", timevar = "Selite", v.names = "Lokaatio", direction = "wide")
  colnames(P) <- ifelse(substr(colnames(P), 1, 9) == "Lokaatio.", substr(colnames(P), 10, 30), colnames(P))
  return(P)
}
######################################
## opasnet.data downloads a file from Finnish Opasnet wiki, English Opasnet wiki, or Opasnet File.
## Parameters: filename is the URL without the first part (see below), wiki is "opasnet_en", "opasnet_fi", or "M-files".
## If table is TRUE, the file is assumed to be a table readable by read.table; all other parameters are passed on to read.table.
## Otherwise the content is returned as a character string (this requires the RCurl package).
opasnet.data <- function(filename, wiki = "opasnet_en", table = FALSE, ...)
{
  if (wiki == "opasnet_en") {
    file <- paste("http://en.opasnet.org/en-opwiki/images/", filename, sep = "")
  } else if (wiki == "opasnet_fi") {
    file <- paste("http://fi.opasnet.org/fi_wiki/images/", filename, sep = "")
  } else if (wiki == "M-files") {
    file <- paste("http://fi.opasnet.org/fi_wiki/extensions/mfiles/", filename, sep = "")
  } else {
    stop("Unknown wiki: ", wiki)
  }
  if (table) {
    return(read.table(file, ...))
  } else {
    return(RCurl::getURL(file))
  }
}
############ tidy: a function that cleans the tables from Opasnet Base
# data is a table from op_baseGetData function
tidy <- function (data, idvar = "obs", direction = "long") {
  # Text results, where present, override the numeric results.
  data$Result <- ifelse(!is.na(data$Result.Text), as.character(data$Result.Text), data$Result)
  if("Observation" %in% colnames(data)){test <- data$Observation != "Description"} else {test <- TRUE}
  data <- data[test, !colnames(data) %in% c("id", "Result.Text")]
  if("obs.1" %in% colnames(data)) {data[, "obs"] <- data[, "obs.1"]} # this line is temporarily needed until the obs.1 bug is fixed.
  data <- data[colnames(data) != "obs.1"]
  if("Row" %in% colnames(data)) { # If user has given Row, it is used instead of automatic obs.
    data <- data[, colnames(data) != "obs"]
    colnames(data)[colnames(data) == "Row"] <- "obs"
  }
  if(direction == "wide" && "Observation" %in% colnames(data))
  {
    data <- reshape(data, idvar = idvar, timevar = "Observation", v.names = "Result", direction = "wide")
    data <- data[colnames(data) != "obs"]
    colnames(data) <- gsub("^Result\\.", "", colnames(data))
    colnames(data)[colnames(data) == "result"] <- "Result"
    colnames(data)[colnames(data) == "Amount"] <- "Result"
  }
  else
  {
    data <- data[colnames(data) != "obs"]
  }
  return(data)
}
############### summary.bring: Bring parts of summary table
# page is the page identifier for the summary table.
# base is the database name passed on to op_baseGetData.
summary.bring <- function(page, base = "opasnet_base"){
  data <- tidy(op_baseGetData(base, page))
  pages <- levels(data$Page)
  ## temp contains the additional information that is not on the actual data table.
  temp <- data[, !colnames(data) == "Observation"]
  temp <- reshape(temp, idvar = "Page", timevar = "Index", direction = "wide")
  colnames(temp) <- ifelse(substr(colnames(temp), 1, 7) == "Result.", substr(colnames(temp), 8, 50), colnames(temp))
  ## Get all data tables one at a time and combine them.
  for(i in seq_along(pages)){
    out <- op_baseGetData(base, pages[i])
    out <- tidy(out)
    cols <- colnames(out)[!colnames(out) %in% c("Observation", "Result")]
    out <- reshape(out, timevar = "Observation", idvar = cols, direction = "wide")
    colnames(out) <- ifelse(substr(colnames(out), 1, 7) == "Result.", substr(colnames(out), 8, 50), colnames(out))
    out <- merge(temp[temp$Page == pages[i], ][colnames(temp) != "Page"], out)
    ## Check that all data tables have all the same columns before you combine them with rbind.
    ## Missing columns are filled with "*". The filler is stored in fill, not temp, because temp is needed again on later rounds.
    if(i == 1){out2 <- out} else {
      addcol <- colnames(out2)[!colnames(out2) %in% colnames(out)]
      if(length(addcol) > 0) {
        fill <- as.data.frame(array("*", dim = c(1, length(addcol))))
        colnames(fill) <- addcol
        out <- merge(out, fill)}
      addcol <- colnames(out)[!colnames(out) %in% colnames(out2)]
      if(length(addcol) > 0) {
        fill <- as.data.frame(array("*", dim = c(1, length(addcol))))
        colnames(fill) <- addcol
        out2 <- merge(out2, fill)}
      ## Combine data tables.
      out2 <- rbind(out2, out)}
  }
  return(out2)
}
##########
</rcode>
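The sampling idea behind PTable (draw a uniform random number and pick the first location whose cumulative probability reaches it) can be shown in isolation. This is a minimal sketch under stated assumptions, not the package's code: the index name Fuel, the locations and the probabilities are hypothetical, and findInterval() is swapped in for the merge-and-filter steps that PTable uses.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical probability table: one index (Selite) with three locations.
P <- data.frame(Selite = "Fuel",
                Lokaatio = c("Coal", "Gas", "Wood"),
                Result = c(0.5, 0.3, 0.2),
                stringsAsFactors = FALSE)

n <- 1000
u <- runif(n)               # one uniform draw per iteration
upper <- cumsum(P$Result)   # cumulative probabilities

# Index of the first location whose cumulative probability reaches the draw.
# pmin() guards against rounding error in the last cumulative value.
pick <- pmin(findInterval(u, upper, left.open = TRUE) + 1, nrow(P))

draws <- data.frame(obs = seq_len(n), Fuel = P$Lokaatio[pick])
shares <- table(draws$Fuel) / n   # should be close to the given probabilities
```

The result has the same shape as the output of PTable: one row per iteration (obs) and one column per index, holding the sampled location.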
[[op_fi:OpasnetBaseUtils]]
[[Heande:OpasnetBaseUtils]]
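The long-to-wide conversion that tidy performs with direction = "wide" relies on reshape() from base R. The sketch below shows that step alone on a hypothetical table; the observation names Dose and Response and the values are made up for illustration.

```r
# Hypothetical long-format table: one row per (obs, Observation) pair.
long <- data.frame(
  obs = rep(1:3, each = 2),
  Observation = rep(c("Dose", "Response"), times = 3),
  Result = c(1, 0.10, 2, 0.25, 3, 0.45)
)

# One row per obs, one column per Observation value.
wide <- reshape(long, idvar = "obs", timevar = "Observation",
                v.names = "Result", direction = "wide")

# reshape() names the new columns "Result.Dose" and "Result.Response";
# strip the prefix in the same way as tidy does.
colnames(wide) <- sub("^Result\\.", "", colnames(wide))
```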
==Rationale==
A suggestion about the structure and content:
There should be just one package (at least for the time being) from Opasnet developers, namely ''OpasnetUtils''. It would contain the following parts:
* OpasnetBaseUtils for connections to and from [[Opasnet Base]].
** Suggested function names: opbase.read (previously op_baseGetData), opbase.write (previously op_baseWrite).
{{comment|# |The original distinction between Write and GetData arose from the fact that data isn't the only thing read from the base. GetLocs also exists for getting location info on a particular data set. Of course GetLocs could be renamed locs or locations, but that loses some of the information contained in the function names.|--[[User:Teemu R|Teemu R]] 09:25, 9 May 2011 (EEST)}}
* Functions for particular tasks needed in Opasnet assessments, such as calculating health impacts from an ERF (the function takes in RR or OR or both and automatically calculates a synthesis), exposure and background disease.
** Suggested function names: ophia.lifetable (for life table calculation), ophia.hia (for simple impact calculation), opgis.population (for slicing population data from a database for a case), opmath.sip and opmath.unsip (for turning a random sample into a [[SIPs and SLURPs|SIP]] and a SIP into a random sample, respectively), etc.
* Outdated functions kept for compatibility reasons, such as [[Operating intelligently with multidimensional arrays in R|IntArray]].
* Functions or practices for handling uncertain variables: how to merge run/obs index into a data.frame.
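One possible practice for the last item is to give every uncertain variable an obs column holding the iteration number and to combine variables with merge(). This is a minimal sketch under that assumption; the variable names (exposure, erf) and the distributions are hypothetical.

```r
set.seed(42)  # only to make the sketch reproducible
n <- 5        # number of Monte Carlo iterations (kept small on purpose)

# Two hypothetical uncertain variables, one row per iteration.
exposure <- data.frame(obs = 1:n,
                       Exposure = rlnorm(n, meanlog = 0, sdlog = 0.5))
erf <- data.frame(obs = 1:n,
                  ERF = rnorm(n, mean = 0.01, sd = 0.002))

# Merging on obs pairs draw i of one variable with draw i of the other.
impact <- merge(exposure, erf, by = "obs")
impact$Result <- impact$Exposure * impact$ERF
```

A deterministic variable has no obs column, so a plain merge() with it repeats its values over all iterations.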
If the suggestion is accepted, the following things could be done to organise pages:
* [[:File:OpasnetBaseUtils 0.8.0.zip]] is moved to [[:File:OpasnetUtils.zip]] (version numbers should NOT be in the filename).
{{attack|# |This does not seem wise, my previous experience is that files downloaded from Opasnet are cached in some very special place and I was unable to download the most recent version of a certain file, because of the similar filename. Also I think any programmer would agree that it'd be bad practice to not include an easily accessible version number on the file. Instead we should consider use of some version management system e.g. SVN. |--[[User:Teemu R|Teemu R]] 09:25, 9 May 2011 (EEST)}}
* The content of [[OpasnetBaseUtils]] is copied and the page is redirected to [[:file:OpasnetUtils.zip]].
* [[OpasnetUtils]] is redirected to [[:file:OpasnetUtils.zip]].
* [[File:OpasnetUtils.zip]] contains an explanation and links back to the archived pages mentioned above.
== Instructions ==
# Download [[File:OpasnetBaseUtils 0.8.4.zip]] (save it in a location you can easily find)
# Open [[R]]
# Click "Packages" on the topbar and choose "Install package(s) from local zip files..." from the drop-down menu
# Locate the downloaded .zip file and install
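The menu steps above can probably be replaced by a single call from the R console. The sketch below assumes a Windows binary zip; the path and file name are hypothetical and must be adjusted to wherever the file was actually saved.

```r
# Hypothetical path: adjust to the location and name of the downloaded file.
zip_path <- file.path("~", "Downloads", "OpasnetBaseUtils_0.8.4.zip")

if (file.exists(zip_path)) {
  # repos = NULL makes install.packages() install from the local file
  # instead of looking the package up in a repository.
  install.packages(zip_path, repos = NULL)
} else {
  message("File not found: ", zip_path)
}
```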
{{mfiles}}
=== Usage ===
[[Category:Opasnet Base]]
=== Change log ===
Forgot about this earlier so I'll add a change log now.
*0.8.4 - New versions of the database upload and download functions were added; they now support special characters properly in both Opasnet and Heande.
== See also ==
*[[File:OpasnetBaseUtils v.0.8.4 source.zip|OpasnetBaseUtils sources]]
**To build from source, run '''R CMD build <src folder>''' and '''R CMD INSTALL <src folder>''' on the command line of a properly configured machine (most Unix systems require no configuration).
{{Opasnet training}}
Latest revision as of 11:02, 26 August 2013
Usage
library(OpasnetBaseUtils)
- For function usage notes see the following pages:
Dependencies
- You need to have installed another package called RODBC which in turn requires the utils package. These packages are available from the CRAN repositories and can be easily installed from within R.
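Whether the RODBC dependency is already available can be checked from within R before loading OpasnetBaseUtils. This is a minimal sketch; the variable name has_rodbc is only illustrative.

```r
# requireNamespace() returns TRUE or FALSE without attaching the package.
has_rodbc <- requireNamespace("RODBC", quietly = TRUE)

if (!has_rodbc) {
  # Installing from CRAN needs a network connection, so it is only suggested here.
  message("RODBC is missing; install it with install.packages(\"RODBC\")")
}
```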