Title: | Datasets and Functions for the Class "Modelling and Data Analysis for Pharmaceutical Sciences" |
---|---|
Description: | Provides datasets and functions for the class "Modelling and Data Analysis for Pharmaceutical Sciences". The datasets can be used to present various methods of data analysis and statistical modeling. Functions for data visualization are also implemented. |
Authors: | Lionel Voirol [aut, cre], Stéphane Guerrier [aut], Yuming Zhang [aut], Luca Insolia [aut] |
Maintainer: | Lionel Voirol <[email protected]> |
License: | AGPL-3 |
Version: | 0.0.4 |
Built: | 2025-03-08 04:30:32 UTC |
Source: | https://github.com/cran/idarps |
boxplot_w_points
boxplot_w_points(
  ...,
  col_points = "#9033FF3F",
  col_boxplot = "#d2d2d2",
  horizontal = FALSE,
  main = "",
  names = NULL,
  las = 0,
  xlab = "",
  ylab = "",
  seed = 123,
  jitter_param = 0.25
)
... |
data vectors to be visualized. |
col_points |
color of the points to be added to the boxplot. |
col_boxplot |
color of the boxplot. |
horizontal |
logical indicating if the boxplots should be horizontal; default FALSE means vertical boxes. |
main |
string indicating the title of the plot. |
names |
vector of strings giving the group labels printed under each boxplot. |
las |
a numeric value indicating the orientation of the tick mark labels and any other text added to a plot after its initialization. The options are as follows: always parallel to the axis (the default, 0), always horizontal (1), always perpendicular to the axis (2), and always vertical (3). |
xlab |
a string indicating the x label. |
ylab |
a string indicating the y label. |
seed |
an integer specifying a seed for the random jitter of the boxplot points. |
jitter_param |
a double specifying the amount of jittering applied on points. |
No return value. Plots a boxplot.
x <- rnorm(20, mean = 5)
y <- rnorm(20, mean = 10)
z <- rnorm(20, mean = 15)
boxplot_w_points(x, main = "test")
boxplot_w_points(x, y, names = c("x", "y"), las = 1, main = "Data")
boxplot_w_points(x, y, z, names = c("x", "y", "z"), horizontal = TRUE, las = 1, main = "Data")
boxplot_w_points(x, y, z, names = c("x", "y", "z"), horizontal = FALSE, las = 1, main = "Data")
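The role of `seed` and `jitter_param` can be illustrated with a small base-R sketch. This is not the package's implementation: `boxplot_with_points_sketch` is a hypothetical helper, and it assumes the jitter is a uniform horizontal offset of at most `jitter_param` around each box position.

```r
# Hypothetical helper, NOT the idarps implementation: a base-R sketch of a
# boxplot with the raw data points drawn on top, jittered horizontally.
boxplot_with_points_sketch <- function(..., col_points = "#9033FF3F",
                                       col_boxplot = "#d2d2d2",
                                       seed = 123, jitter_param = 0.25) {
  groups <- list(...)
  # Draw the boxes first; outliers are hidden because every point is drawn below.
  boxplot(groups, col = col_boxplot, outline = FALSE,
          ylim = range(unlist(groups)))
  # Fixing the seed makes the random jitter reproducible across calls.
  set.seed(seed)
  offsets <- lapply(groups, function(g) {
    runif(length(g), min = -jitter_param, max = jitter_param)
  })
  for (i in seq_along(groups)) {
    points(i + offsets[[i]], groups[[i]], pch = 16, col = col_points)
  }
  invisible(offsets)
}

x <- rnorm(20, mean = 5)
y <- rnorm(20, mean = 10)
o1 <- boxplot_with_points_sketch(x, y)
o2 <- boxplot_with_points_sketch(x, y)  # same seed, so the same jitter
```

Because the seed is set inside the helper, two calls on the same data place the points at identical positions, which is presumably why the documented function exposes a `seed` argument.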
This dataset consists of several clinical features observed or measured for 116 participants in a study of breast cancer.
BreastCancer
Age in years
Body mass index in kg/m^2
Glucose in mg/dL
Insulin in µU/mL
Homeostasis model assessment
Presence of breast cancer (0 if no cancer, 1 if with cancer)
https://bmccancer.biomedcentral.com/articles/10.1186/s12885-017-3877-1
Patricio, Miguel, et al. "Using Resistin, glucose, age and BMI to predict the presence of breast cancer", BMC Cancer, (2018).
Data collected in a study to assess the effects of smoking and pollution on being diagnosed with bronchitis. This dataset is based on 212 subjects.
bronchitis
Presence of bronchitis (0 for no and 1 for yes)
Average daily number of smoked cigarettes
Pollution index
This dataset is based on an observational study conducted at Geneva University Hospitals to assess the impact of weight on the pharmacokinetics of dexamethasone in normal-weight versus obese patients hospitalized for COVID-19.
codex
ID of the patient
Gender (0 for men and 1 for women)
Age
Body mass index
Weight in kg
Number of doses of the dexamethasone (DEX) drug
Time taken for the drug to reach its maximum concentration (Cmax) after administration, in hours (h)
Maximum concentration (Cmax) that the drug reaches in the blood after administration (ng/mL)
Half-life (t1_2): time required for the drug concentration in the body to decrease by one-half during elimination, in hours (h)
Integral (from 0 to 8 hours) of the curve describing the drug concentration in the blood as a function of time after administration (ng.h/mL)
Number of days the patient was hospitalized
Number of days the patient was hospitalized in the intermediate and intensive care units
crp
Presence of comorbidity type e
Presence of comorbidity type p
Presence of comorbidity type v
Presence of comorbidity type c
Presence of comorbidity type r
Indicator variable based on whether the subject is obese (i.e. with BMI > 30), 0 for no and 1 for yes.
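The pharmacokinetic summaries described above (maximum concentration, time to maximum concentration, half-life, and the integral of the concentration curve from 0 to 8 hours) can be computed from a concentration-time profile. The sketch below uses invented values for a single hypothetical patient, the trapezoidal rule for the integral, and a log-linear fit on the last three time points for the half-life. The source does not say how the study derived these quantities, so both method choices are assumptions.

```r
# Hypothetical concentration-time profile for one patient.
# The values are invented for illustration and are NOT taken from the codex data.
time_h <- c(0, 0.5, 1, 2, 4, 6, 8)        # hours after administration
conc   <- c(0, 40, 65, 80, 55, 35, 20)    # concentration in ng/mL

# Maximum concentration and the time at which it is reached.
cmax <- max(conc)
tmax <- time_h[which.max(conc)]

# Integral from 0 to 8 h by the trapezoidal rule (ng.h/mL):
# each interval contributes width * average of its two end concentrations.
auc_0_8 <- sum(diff(time_h) * (head(conc, -1) + tail(conc, -1)) / 2)

# Half-life from a log-linear fit on the elimination phase
# (assumed here to be the last three time points).
terminal <- tail(seq_along(time_h), 3)
slope <- coef(lm(log(conc[terminal]) ~ time_h[terminal]))[[2]]
t_half <- log(2) / -slope

cmax     # 80
tmax     # 2
auc_0_8  # 388.75
```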
This dataset contains measured biomarkers in pigs fed with various diets.
cortisol
A data frame with 61 rows and 9 variables:
the id of the pig
the diet fed to the pig (chipped diet or non-chipped diet)
the gender of the pig
urine cortisol in pg/ml
serum ACTH in pg/ml
serum CRH in pg/ml
testosterone in ng/ml
LH in ng/ml
daily caloric intake in kcal
Data from Parisi et al. (2021), who study the applicability of predictive models for intensive care admission of COVID-19 patients in a secondary care hospital in Belgium. The study is based on data from patients admitted to an emergency department with a positive RT-PCR SARS-CoV-2 test.
covid
A data frame with 64 rows and 5 variables:
admission to an Intensive Care Unit (0 for no, 1 for yes)
sex (men, women)
age in years
lactate dehydrogenase in U/L
oxygen saturation in percentage
https://jeccm.amegroups.org/article/view/6927/html
Parisi, Nicolas, et al. "Non applicability of validated predictive models for intensive care admission and death of COVID-19 patients in a secondary care hospital in Belgium.", Journal of Emergency and Critical Care Medicine, (2021).
Data from the COVID-19 Data Hub joined with spatial features for Switzerland.
data_covid_switzerland_spatial
Country
3-letter code of the country according to the standard ISO 3166-1 Alpha-3
Date
Cumulative number of confirmed cases
Total population
Cumulative number of tests
Daily number of confirmed cases
Daily number of tests
Number of daily confirmed cases divided by the country population
Moving Average applied to confirmed_per_pop with a window of 7 days
'sf' geometry list of country
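The derived columns described above (daily counts from cumulative counts, cases per inhabitant, and a 7-day moving average) can be sketched in base R. The numbers below are invented, and the trailing (rather than centered) window is an assumption, since the source only states that the window is 7 days.

```r
# Hypothetical cumulative case counts over 10 days and a hypothetical population.
cumulative <- c(0, 10, 25, 45, 70, 100, 135, 175, 220, 270)
population <- 8.6e6

# Daily counts are the day-to-day differences; the first day has no predecessor.
daily <- c(NA, diff(cumulative))

# Daily confirmed cases divided by the population.
per_pop <- daily / population

# 7-day moving average over the current day and the six preceding days.
# Positions whose window contains an NA or is incomplete come out as NA.
ma7 <- as.numeric(stats::filter(per_pop, rep(1 / 7, 7), sides = 1))

# The first complete window ends on day 8 and averages 25 daily cases.
ma7[8] * population
```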
This dataset contains reports of diabetes symptoms from 520 individuals, encompassing symptoms potentially associated with the condition. It was compiled through a questionnaire aimed at recently diagnosed diabetics or individuals displaying one or more symptoms. Data collection took place via direct questionnaire at Sylhet Diabetes Hospital in Bangladesh.
diabetes
Age of the patient in years
Gender of the patient (Male, Female)
Presence of polyuria (excessive urination) (Yes, No)
Presence of polydipsia (excessive thirst) (Yes, No)
Presence of sudden weight loss (Yes, No)
Presence of weakness (Yes, No)
Presence of polyphagia (excessive hunger) (Yes, No)
Presence of genital thrush (Yes, No)
Presence of visual blurring (Yes, No)
Presence of itching (Yes, No)
Presence of irritability (Yes, No)
Presence of delayed healing (Yes, No)
Presence of partial paresis (Yes, No)
Presence of muscle stiffness (Yes, No)
Presence of alopecia (Yes, No)
Presence of obesity (Yes, No)
Diagnosis class (1 if presence of diabetes, 0 otherwise)
https://link.springer.com/chapter/10.1007/978-981-13-8798-2_12
Islam, M. M. F., et al. "Likelihood prediction of diabetes at early stage using data mining techniques", Computer vision and machine intelligence in medical image analysis, (2020).
Diet
diet
ID
Gender (male or female)
Age in years
Height in m
Type of diet (A, B or C)
Initial weight in kg
Final weight in kg
This dataset is based on a study conducted in suburban Boston in the late 1970s to investigate the relationship between forced expiratory volume and smoking behavior in 654 youths between the ages of 3 and 19.
fev
forced expiratory volume or FEV, which measures the amount of air a person can exhale during a forced breath.
age in years
gender of the person (0 for males and 1 for females)
height in cm
smoking behavior (0 for non-smokers and 1 for smokers)
hist_compare_to_normal
hist_compare_to_normal(
  x,
  col = "lightgray",
  main = "",
  xlab = "",
  ylab = "",
  lwd_line = 1.5,
  col_line1 = "#ff160e",
  col_line2 = "#335bff",
  add_legend = TRUE,
  legend_position = "topleft",
  delta = 0.2,
  ...
)
x |
data vector to be visualized. |
col |
color of the histogram. |
main |
string indicating the title of the plot. |
xlab |
a string indicating the x label. |
ylab |
a string indicating the y label. |
lwd_line |
width of density lines. |
col_line1 |
color of the density line based on classical MLE estimation. |
col_line2 |
color of the density line based on robust estimation. |
add_legend |
a Boolean indicating whether the estimated parameters of the Normal distribution should be displayed. |
legend_position |
a string specifying the position of the legend. |
delta |
graphic parameter to determine the shrinkage of the axis. |
... |
Extra graphical arguments. |
No return value. Plots a histogram.
n <- 1000
x <- rnorm(n = n)
hist_compare_to_normal(x)
x2 <- rexp(n, rate = 25)
hist_compare_to_normal(x2, legend_position = "topright")
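One way to read the two density lines is as normal curves fitted with two sets of estimates. The sketch below is not the package code: `normal_fits_sketch` is a hypothetical helper. It assumes the classical fit uses the sample mean and the MLE standard deviation, and it stands in the median and the MAD for the robust fit, since the source does not say which robust estimator is used.

```r
# Hypothetical helper, NOT the idarps implementation: overlay two fitted
# normal densities on a histogram and return the estimates.
normal_fits_sketch <- function(x, col_line1 = "#ff160e", col_line2 = "#335bff") {
  n <- length(x)
  # Classical fit: sample mean and MLE standard deviation (divides by n).
  mle <- c(mean = mean(x), sd = sqrt(sum((x - mean(x))^2) / n))
  # Assumed robust fit: median and MAD (mad() is scaled to estimate a normal sd).
  robust <- c(mean = median(x), sd = mad(x))

  hist(x, freq = FALSE, col = "lightgray", main = "", xlab = "")
  grid <- seq(min(x), max(x), length.out = 200)
  lines(grid, dnorm(grid, mle[["mean"]], mle[["sd"]]), col = col_line1, lwd = 1.5)
  lines(grid, dnorm(grid, robust[["mean"]], robust[["sd"]]), col = col_line2, lwd = 1.5)
  legend("topleft", legend = c("MLE", "robust (median, MAD)"),
         col = c(col_line1, col_line2), lwd = 1.5, bty = "n")
  invisible(list(mle = mle, robust = robust))
}

# A normal sample with one gross outlier: the classical standard deviation is
# inflated by the outlier, while the robust estimates barely move.
set.seed(1)
contaminated <- c(rnorm(200), 50)
fits <- normal_fits_sketch(contaminated)
fits$mle
fits$robust
```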
Data from an experiment on rats comparing HP13C bicarbonate signal intensities, normalized to the total sum of metabolites, and the corresponding initial reaction rate as a function of the injected dose of HP1-13C pyruvate. Two groups of rats were compared (fed and overnight-fasted). Dataset from Can et al. (2022).
HP13Cbicarbonate
HP13C bicarbonate signal intensities normalized to the total sum of metabolites
initial reaction rate as a function of the injected dose of HP13C pyruvate
fed and overnight-fasted
https://www.nature.com/articles/s42003-021-02978-2
This dataset contains a collection of variables believed to be potentially associated with the blood pressure measurements of 213 individuals from Kuwait. The dataset lists the following variables:
kuwait_bp
Age in years
Weight in kg
Height in mm
Chin skinfold in cm
Forearm skinfold in cm
Calf skinfold in cm
Resting pulse rate
Whether or not the participant is left-handed
The Body Mass Index (BMI) of the participant
Systolic blood pressure
This dataset consists of variables possibly relating to blood pressures of 39 Peruvians who have moved from rural high-altitude areas to urban lower-altitude areas.
PeruvianBP
Age in years
Years in urban area
Weight in kg
Height in mm
Chin skinfold
Forearm skinfold
Calf skinfold
Resting pulse rate
Systolic blood pressure
This dataset contains the number of clients in a pharmacy for each hour over two years.
pharmacy
A data frame with 17520 rows and 4 variables:
the date
the hour of the day
the week day
the recorded number of clients
This dataset is based on a study of the effectiveness of directed reading activities for elementary school students (6-12 years old).
reading
Student id
Degree of Reading Power (DRP) test score
Age of the students
Binary variable indicating whether a student participated in the directed reading activities (Treatment if the student participated, Control otherwise)
This dataset is based on a study on the physical and behavioral characteristics of snorers.
snoring
gender of the person (0 for males and 1 for females)
age in years
height in cm
weight in kg
smoking behavior (0 for non-smokers and 1 for smokers)
number of glasses drunk per day (in red wine equivalent)
snoring diagnosis (0 for not snoring, 1 for snoring)