Recommended R functions: Difference between revisions
(→Answer: recommended structures added)
(→Recommended functions: multivarplot created based on heande:Kuovesi)
===Recommended tailored functions===
'''Code tailoredfunctions:'''
* '''multivarplot''' produces a graph for multiple variables along the same X axis (usually a timeline).
<rcode name="tailoredfunctions">
##### multivarplot produces a graph for multiple variables along the same X axis (usually a timeline). Parameters:
# a: data.frame that has three columns: DateTime = x axis value, TagName = name of the variable, and Value = y axis value for the variable.
# precision: a smoothing parameter, passed to loess.smooth as span (0 = no smoothing)
# timeline: TRUE if DateTime has POSIXct format, FALSE if it is a real number.

multivarplot <- function(a, precision = 0, timeline = FALSE) {
  # as.factor makes this work also when TagName is a character column; droplevels skips unused levels.
  tags <- levels(droplevels(as.factor(a$TagName)))
  cols <- rainbow(length(tags)) # One colour per variable.
  # Reserve left margin space for one y axis per variable, and restore the old margins on exit.
  oldpar <- par(mar = c(5, length(tags) * 3.5 + 1.5, 4, 4) + 0.1)
  on.exit(par(oldpar))
  for (k in seq_along(tags)) {
    sel <- a$TagName == tags[k]
    x <- a$DateTime[sel]
    y <- a$Value[sel]
    if (k > 1) par(new = TRUE) # Draw the later variables on top of the first plot.
    plot(
      if (precision == 0) list(x = x, y = y) else loess.smooth(x, y, degree = 1, span = precision),
      axes = FALSE, xlab = "", ylab = "", type = "l", col = cols[k], main = "",
      xlim = range(a$DateTime),
      ylim = c(min(y) - sd(y) * 0.1, max(y) + sd(y) * 0.1)
    )
    # Each variable gets its own y axis, shifted outwards by 3.5 margin lines.
    axis(2, col = cols[k], lwd = 2, line = (k - 1) * 3.5)
    mtext(tags[k], side = 2, line = (k - 1) * 3.5 + 2, col = cols[k])
  }
  # Default tick positions are used instead of one tick per data point.
  if (timeline) axis.POSIXct(1, a$DateTime) else axis(1)
  mtext("Time", side = 1, col = "black", line = 2)
}
</rcode>
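The parameter comments above describe the input as a long-format data.frame with the columns DateTime, TagName and Value. As a minimal sketch, such an input could be built as follows. The variable names and values are hypothetical and for illustration only, and the call is guarded so that the snippet also runs when multivarplot has not been defined in the session.

```r
# Hypothetical measurements for two variables on a shared, numeric x axis.
a <- data.frame(
  DateTime = rep(1:20, times = 2),
  TagName  = factor(rep(c("Temperature", "Oxygen"), each = 20)),
  Value    = c(15 + sin(1:20 / 3), 8 + cos(1:20 / 3))
)

# Draw the graph only if multivarplot has been defined in this session.
if (exists("multivarplot")) {
  multivarplot(a, precision = 0, timeline = FALSE)
}
```

Because DateTime is a plain number here, timeline is FALSE; with POSIXct values it would be set to TRUE.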
Revision as of 16:09, 30 December 2011
Moderator: Jouni
This page is a stub. You may improve it into a full page.
Recommended R functions describes good practices for writing R code. The code should be short, straightforward to understand, efficient to run, and similar to everyone else's code.
Question
What are good practices for writing R code? The code should be short, straightforward to understand, efficient to run, and similar to everyone else's code.
Answer
Recommended structures
- When possible, use data.frames rather than arrays, tables or lists.
- Standard columns for data.frames:
  - id: row identifier in the res table of Opasnet Base. Usually not needed, so it can be sliced away.
  - obs: row identifier in a data table or one uploaded piece of data. Technically, a piece of data that has the same series_id in Opasnet Base.
  - iter: identifier of the iteration in a Monte Carlo simulation.
  - There are several standard indices in Opasnet Base such as Year, Sex, Age, Lo (longitude), La (latitude), and so on. If possible, use these names. For a full reference, see Opasnet Base Indices.
  - Result: the actual result of the object, typically numeric. There are also other columns used for results, namely
    - Result.Text: in Opasnet Base, results that are in text format are stored in this column, and
    - Freq: when the output of tapply() is converted to a data.frame with as.data.frame(as.table()), the summarised result is given in the Freq column.
  - Unit: if the unit can differ between rows, a separate column Unit is needed.
  - Description: can contain any descriptive information about the row. It is not used in calculations.
- For health impacts, there are standard tables to be used. For details, see Health impact assessment.
- For time tracking of working hours, there are standard tables to be used. For details, see op_fi:Aikakone.
- For listing several similar tables that should be bound rowwise, use standard tables described in Using summary tables.
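As a rough illustration of the standard columns listed above, the following sketch builds a small data.frame and slices the id column away. All values and the unit are hypothetical; only the column names come from the list.

```r
# Hypothetical example data; the column names follow the standard columns listed above.
dat <- data.frame(
  id     = 101:106,                       # row identifier in the res table (usually not needed)
  obs    = rep(1:3, each = 2),            # row identifier within the uploaded data
  iter   = rep(1:2, times = 3),           # Monte Carlo iteration
  Year   = rep(c(2009, 2010, 2011), each = 2),
  Result = c(1.2, 1.4, 1.1, 1.5, 0.9, 1.3),
  Unit   = "ug/m3",                       # assumed unit, the same on every row here
  stringsAsFactors = FALSE
)

# Slice away id, which is rarely needed in calculations.
dat <- dat[, colnames(dat) != "id"]
```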
Recommended generic functions
What to do | Functions to use routinely | Functions to avoid except in special cases | Examples and description |
---|---|---|---|
Manipulate data | data.frame | array | |
Draw graphs | ggplot, plot | | ggplot requires library(ggplot2) |
Summarise data along a criterion | tapply | | |
Join two data.frames | merge | IntArray | |
Add rows to a data.frame | rbind | | |
Add columns to a data.frame | cbind | | |
Transform a table from long to wide or vice versa | reshape | | |
Get data from Opasnet Base | op_baseGetData | | Requires library(OpasnetBaseUtils). Example: op_baseGetData("opasnet_base", "Op_en4523")[, -c(1,2)] gets the object with identifier Op_en4523 and slices columns 1 and 2 (id, obs) away. |
Get index values from Opasnet Base | op_baseGetLocs | | Requires library(OpasnetBaseUtils) |
Write data to Opasnet Base | op_baseWrite | | Requires library(OpasnetBaseUtils). Only works from a THL computer, not R-tools |
Slicing R objects | data[rows, cols], data$col | | This is not a function but rather a list of practical ways of slicing an object. |
 | match | | |
 | ifelse | | |
 | is.na | | |
 | colnames | | |
Convert between data types | as.numeric, as.character, as.factor | | |
Convert between object types | as.data.frame, as.table | | |
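The base R functions in the table can be combined roughly as in the sketch below. The data are hypothetical and for illustration only; the Opasnet Base functions are left out because they need library(OpasnetBaseUtils) and a database connection.

```r
# Hypothetical data for illustration only.
exposure <- data.frame(
  Year   = c(2010, 2010, 2011, 2011),
  Sex    = c("Female", "Male", "Female", "Male"),
  Result = c(1.0, 2.0, 3.0, 5.0)
)
population <- data.frame(
  Sex        = c("Female", "Male"),
  Population = c(100, 200)
)

# Summarise data along a criterion: mean Result for each Year.
by_year <- tapply(exposure$Result, exposure$Year, mean)

# Convert the summary back to a data.frame; the values end up in column Freq.
by_year_df <- as.data.frame(as.table(by_year))

# Join two data.frames on their shared column (Sex).
joined <- merge(exposure, population)

# Add rows to a data.frame.
more <- rbind(exposure, data.frame(Year = 2012, Sex = "Female", Result = 4.0))

# Transform from long to wide: one row per Year, one Result column per Sex.
wide <- reshape(exposure, idvar = "Year", timevar = "Sex", direction = "wide")

# Replace missing values using ifelse and is.na.
x <- c(1, NA, 3)
x_filled <- ifelse(is.na(x), 0, x)
```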
Recommended tailored functions
Code tailoredfunctions:
- multivarplot produces a graph for multiple variables along the same X axis (usually a timeline).
Rationale
Based on experience and testing.
See also
References
Related files