Opasnet Base (2008-2011): Difference between revisions

(Opasnet base moved to Opasnet Base structure: The variable is about the structure. There should be an encyclopedia article about the Base itself.)
 
(technical edits)
 
(19 intermediate revisions by 3 users not shown)
Line 1: Line 1:
#REDIRECT [[Opasnet Base structure]]
[[Category:Opasnet Base]]
[[Category:Data]]
[[Category:Opasnet]]
[[Category:Open assessment]]
[[Category:SQL tool]]
[[Category:Tool]]
[[Category:Glossary term]]
{{encyclopedia|moderator = Jouni}}
 
:''This page is about the '''old structure of Opasnet Base''' used in ca. 2008-2011. For a description of the current database, see [[Opasnet Base]].''
 
This page is a general description of '''Opasnet Base'''. For an additional description with screenshots, see the [[:Image:Opasnet Base description.odt|Open Office document]]. For a detailed description of its structure, see [[Opasnet Base structure]].
<section begin=glossary />
:'''Opasnet Base''' is a part of [[Opasnet]] and a storage and retrieval system for [[result|results]] of [[variable|variables]] and [[data]] from [[study|studies]]. It is designed to be flexible enough to store information in almost any format: probability distributions or deterministic point estimates; spatially or temporally distributed data; or data with multiple dimensions. It can be used as a direct source of model input data, thus making it possible to use shared input information sources such as population data, climate scenarios, or dose-responses of pollutants. Opasnet Base can be accessed via links on variable and study pages (e.g. the metadata box), via a [http://base.opasnet.org web interface] and via the model [[:image:Opasnet base connection.ANA|Opasnet base connection.ANA]].
<section end=glossary />
 
[[Uploading to Opasnet Base]] helps you understand what data could and should be uploaded to Opasnet Base and what the recommended data structures and formats are. For technical instructions on how to use the current upload software, see [[Opasnet Base connection]]. This page gives a general description of the database; for technical details, see [[Opasnet Base structure]]. For details about downloading data, see [[Opasnet Base UI]].
 
==User interface for downloading data==
 
The easiest way to use Opasnet Base is to find the variable or study of interest in Opasnet, and then click the links either in the metadata box in the top right corner or in the Result section. They lead to a page where you can select which details of the variable you want to see.
 
[[image:User intarface of Opasnet Base.png|thumb|center|600px|Figure. User interface of Opasnet Base.]]
 
The results in Opasnet Base are typically not single numbers but tables with several indices (also called dimensions or determinants). For example, food intake results may be given for several age groups and for both sexes separately. In this case, sex and age are used as indices. Because many variables have a lot of indices, the results may also be large multi-dimensional tables. This makes it difficult to find the particular numbers the user is interested in.
 
To overcome this problem, the user interface makes it possible to select only those rows of each index that are of interest. It is also possible to select the sample size, i.e. how many iterations are shown for each value. The output of Opasnet Base is a table that first lists technical information such as the row number of the result in the database, the number of the iteration, and the row description for each index. The last column, Result, contains the actual value of the variable.
 
It is possible to copy and paste the resulting table into other software. It is also possible to first save the result as a comma-separated values (CSV) file for further use.
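As an illustration of further use, the following minimal Python sketch filters a saved CSV table by index values and averages the Result column over iterations. The column names Row, Iteration, Age and Sex and all the numbers are hypothetical; only the Result column name comes from the description above, and the real export layout may differ.

```python
import csv
import io

# Hypothetical CSV export: technical columns first, then one column per
# index, and the Result column last. All values are made up.
CSV_TEXT = """Row,Iteration,Age,Sex,Result
1,1,Adult,Female,2.1
1,2,Adult,Female,2.5
2,1,Adult,Male,3.0
2,2,Adult,Male,3.4
3,1,Child,Female,1.2
3,2,Child,Female,1.0
"""


def select_rows(csv_text, **wanted):
    """Keep only the rows whose index columns match the wanted values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if all(row[index] == value for index, value in wanted.items())
    ]


def mean_result(rows):
    """Average the Result column over the selected iterations."""
    values = [float(row["Result"]) for row in rows]
    return sum(values) / len(values)


adult_women = select_rows(CSV_TEXT, Age="Adult", Sex="Female")
print(len(adult_women), mean_result(adult_women))
```

Selecting by keyword arguments mirrors what the user interface does: each index is restricted to the rows of interest, and the remaining iterations are then summarised.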
 
==Some uses of Opasnet Base==
 
===Storage of interpreted model results===
 
Originally, Opasnet Base was designed to serve as the storage for interpreted model results, i.e. [[variable]] [[result]]s. Variables attempt to answer specific real-world questions, and their results are the current best estimates of the answers to these questions. This is different from [[study|studies]], which report the observations from a single study. Variable results are expected to keep improving over time, while data from a study is fixed after the study has been done and the observations made.
 
===Storage of study results===
 
'''Opasnet Base''' can be used to collect observation data from [[study|studies]]. A study can be a traditional research study, documented in Opasnet Data afterwards, or it can be an Opasnet study where the data is collected on a particular Opasnet page with a web form. Opasnet Base can serve several purposes as a data repository:
* To collect observation data to be directly and conveniently usable in interpretations of [[variable]]s and other [[object]]s.
* To jointly collect information about specific cases for the purpose of conditionalizing generalised assessment models with data specific to particular cases.
 
However, there are some things about variables and studies that should be understood:
* The object for a collection of observations is called a [[study]], while the object of interpretations is called a [[variable]]. As an example, a study can collect information about a population group with a questionnaire and a blood sample. The study identifier is Obj.id in Opasnet Base.
* The object may be divided into smaller pieces along one or more [[index|indices]]. For example, if the questionnaire has 30 questions, the questionnaire data can be indexed by an index with 30 columns (or rows, depending on how you view the table), one for each question. Each column of the study object holds one cell, i.e. an answer to one question. If ten blood markers are measured in addition to the 30 questions, the study object has 40 cells and the index has 40 columns (30 from the questionnaire, 10 from the blood sample). The cell identifier is Cell.id in Opasnet Base.
* For each individual patient, there is one row of observations (40 cells per row in the example). The observation row identifier is Res.Sample in Opasnet Base.
* The actual result of a particular cell of a particular patient is located in Sam.Result in Opasnet Base (or in Res_info.Description in the case where the result is text, i.e. non-numeric).
* Each study may be multidimensional just like a variable and have indices along e.g. space, time, or sex.
* If the data is collected using an Opasnet web form, the timestamp and the username or IP are recorded for each entry in the Resinfo.When and Resinfo.Who fields, respectively. This is not needed if the data comes from a previously performed study (which is static data in the eyes of Opasnet).
* In some cases, it might be useful to restrict the number of entries per user to one. However, this is done only at the interpretation phase, where only the last entry is counted. There are no restrictions on entering new data, so a user may change a previous entry simply by making a new one.
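The points above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Entry class and the latest_per_user function are hypothetical names, not part of Opasnet Base, and the field comments merely point to the database fields mentioned above. The sketch shows the interpretation-phase rule in which all entries stay stored but only the last entry of each user is counted.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Entry:
    """One stored observation (hypothetical in-memory stand-in)."""
    obj_id: str     # study identifier, cf. Obj.id
    cell_id: str    # question or blood marker, cf. Cell.id
    sample: int     # observation row, cf. Res.Sample
    result: float   # numeric value, cf. Sam.Result
    who: str        # username or IP, cf. Resinfo.Who
    when: datetime  # timestamp, cf. Resinfo.When


def latest_per_user(entries):
    """Keep the newest entry per (user, cell); the input list is not modified."""
    newest = {}
    for entry in entries:
        key = (entry.who, entry.cell_id)
        if key not in newest or entry.when > newest[key].when:
            newest[key] = entry
    return sorted(newest.values(), key=lambda e: (e.who, e.cell_id))


entries = [
    Entry("study1", "question_1", 1, 3.0, "alice", datetime(2010, 1, 1)),
    Entry("study1", "question_1", 2, 4.0, "bob", datetime(2010, 1, 2)),
    Entry("study1", "question_1", 3, 5.0, "alice", datetime(2010, 1, 3)),
]
counted = latest_per_user(entries)
print([(e.who, e.result) for e in counted])
```

Filtering at read time rather than at write time matches the text: nothing prevents a new entry, and earlier entries remain available if a different interpretation is needed later.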
 
===Downloading results automatically to Analytica model===
 
It is possible to automatically download values from Opasnet Base into an Analytica model. To do this, you must have the [[:file:Opasnet Base connection.ANA|Opasnet Base connection.ANA]] module included in your model, and you must have the Analytica Enterprise edition. Then you use
 
<anacode>
get_sample(variable_identifier)
</anacode>
 
or other functions to retrieve values from Opasnet Base. In this way it is possible to keep data in Opasnet Base so that every time the model is run, it uses the most up-to-date values from the database. If the results are then uploaded to the database, they are automatically coherent with other results in the database. On the other hand, if you always want to use a particular version of the data, that version can be given to the Analytica get_sample function as an additional parameter. In this way, you always get results that are consistent with your previous results (but not necessarily with the newest information about the topic).
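The difference between using the newest values and pinning a particular version can be sketched in Python. This is not the Analytica module: STORED_RESULTS, fetch_sample and its version parameter are hypothetical stand-ins, and the numbers are made up.

```python
# Hypothetical stand-in for the database: every stored version of a
# variable is kept, keyed by an increasing version number.
STORED_RESULTS = {
    "example_variable": {
        1: [10.0, 12.0, 11.0],  # older run
        2: [10.5, 11.5, 11.0],  # newest run
    }
}


def fetch_sample(variable, version=None):
    """Return the samples of one version; default to the newest version."""
    versions = STORED_RESULTS[variable]
    if version is None:
        version = max(versions)
    return versions[version]


latest = fetch_sample("example_variable")             # follows new uploads
pinned = fetch_sample("example_variable", version=1)  # reproducible input
print(latest, pinned)
```

A model that relies on the default follows every new upload, whereas a model that pins the version reproduces its earlier results until the pin is changed deliberately.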
 
===Making value-of-information analyses in Opasnet Base===
 
[[Value of information]] (VOI) is a [[:en:decision analysis|decision analysis]] tool for estimating the importance of remaining uncertainty for decision-making. Opasnet Base can be used to perform a large number of VOI analyses, because all variables are already in the right format: random samples from uncertain variables. The analysis is done by optimising an [[indicator]] variable by adjusting a [[decision variable]] while the variable under analysis is conditionalised to different values. In theory, all this can be done in Opasnet Base by just listing the indicator, the decision variable, and the variable of interest. Practical tools should be developed for this. After that, systematic VOI analyses can be made over a wide range of environmental health issues.
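As a hedged illustration, the textbook expected value of perfect information (EVPI) can be computed directly from samples, as in the Python sketch below. It is not an Opasnet tool: the evpi function, the two decisions, the indicator function and the sample values are all hypothetical. The decision is first optimised once over all samples, then separately for each sampled value of the variable of interest, and the difference of the two expected indicator values is the EVPI.

```python
def evpi(indicator, decisions, samples):
    """Expected value of perfect information about one sampled variable.

    indicator(decision, x) returns the outcome to be maximised.
    """
    n = len(samples)
    # Best single decision when the uncertainty is not resolved first.
    best_without = max(
        sum(indicator(d, x) for x in samples) / n for d in decisions
    )
    # Best decision chosen separately for each sampled value.
    best_with = sum(
        max(indicator(d, x) for d in decisions) for x in samples
    ) / n
    return best_with - best_without


def example_indicator(decision, exposure):
    """Hypothetical outcome: acting has a fixed cost, waiting costs the exposure."""
    return -1.5 if decision == "act" else -exposure


exposure_samples = [0.0, 1.0, 2.0, 3.0]  # made-up samples of the variable of interest
value = evpi(example_indicator, ["act", "wait"], exposure_samples)
print(value)
```

With real data, the samples would be the stored iterations of the variable under analysis, and the indicator would be evaluated iteration by iteration for each option of the decision variable.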
 
===Analysing the change in the quality of a variable result in Opasnet Base===
 
All results that have once been stored in Opasnet Base remain there. Old results can be very interesting for some purposes:
* The time trend of [[informativeness]] and [[calibration]] (see [[performance]]) can be evaluated for a single variable against the newest information.
* Critical pieces of information that had a major impact on the informativeness and calibration can be identified afterwards.
* A large number of variables can be assessed, and e.g. the following questions can be asked:
** How much work is needed to make a variable with reasonable performance for practical applications?
** What are the critical steps after which the variable performance is saturated, i.e., does not improve much despite additional effort?
 
==See also==
 
* [[:Image:Opasnet Base description.odt|Open Office document]] about Opasnet Base
* [[Opasnet Base structure]]
* [[Opasnet]]
* [[Open assessment]]
* [[:image:Opasnet base connection.ANA|Opasnet base connection.ANA]]
* [http://www.csc.fi/english/csc/news/news/e-infra-strategy CSC e-infrastructure strategy]

Latest revision as of 18:28, 10 April 2015



See also