Data Analysis in R for Consumer Science
Introduction to the book
1
Introduction to R
1.1
How to get started - understanding R (and RStudio)
1.1.1
Organise and save scripts
1.2
How to import data
1.2.1
Import data from R-package
1.2.2
Importing a csv file
1.2.3
Importing an Excel file/sheet
1.2.4
Clipboard import
1.2.5
Looking at the imported elements
1.2.6
Numbers and factors - changing categorisation
1.3
How to edit and merge datasets
1.3.1
Edit using Tidyverse
1.4
How to save the data
1.5
How to export data / results to Excel and the like
1.6
How to load your RData
1.7
How to clear your environment
1.8
How to use R projects
1.8.1
How to Create an R Project
2
Libraries
3
Plotting data
3.1
Histograms and boxplots
3.2
Scatter plots
3.3
How to export plots
4
Descriptive statistics
4.1
Descriptives for a continuous variable
4.1.1
Mean / median
4.1.2
Variance
4.1.3
Standard deviation
4.1.4
Calculations
4.2
Distributions of count data
4.3
Aggregate
4.4
Tidyverse
5
Inferential statistics
5.1
Intro
5.2
Hypothesis testing
5.2.1
Power
5.3
Confidence intervals
5.4
T-test
5.5
F-test
5.6
Analysis of Variance (ANOVA)
5.6.1
One-way ANOVA
5.6.2
Two-way ANOVA
5.6.3
Post hoc test - Tukey’s Honest Significant Difference
5.7
Introduction to linear and mixed models
5.8
Normal and Mixed models
5.8.1
Normal model
5.8.2
Mixed model
6
Introduction to PCA and multivariate data
6.1
A bit of math
6.2
Interpreting model output
6.2.1
Biplot
7
Buffet and survey data
7.1
Buffet data
7.1.1
Introduction to buffet data
7.1.2
Plotting buffet data
7.1.3
Mixed model for buffet data
7.2
Survey data
7.2.1
Plotting survey data
7.2.2
Linear model for survey data
7.2.3
Post-hoc test for survey data
7.3
Combining consumption and survey data
7.4
PCA on survey answers
7.4.1
Wrangle data
7.4.2
Build the model
7.4.3
Biplot
7.4.4
Extract the components and run all associations.
8
MST exercises
8.1
Exercise 1 - cleaning Compusense data with R
8.1.1
Exercise 1 - Solution
8.2
Exercise 2 - creating a tableone and visualization
8.2.1
Exercise 2 - Solution
8.3
Exercise 3 - Creating an outcome table
8.3.1
Exercise 3 - Solution
9
Intro to large survey data
9.0.1
Looking at the data and checking formatting
9.1
Descriptive statistics
9.1.1
Numeric values
9.2
Plots
9.2.1
Within groups of data
9.3
Categorical / Ordinal variables
9.3.1
Tables
9.4
Table 1
9.5
The tidyverse way
10
Consumer segmentation
10.1
Segmentation
10.1.1
Using K-means for clustering
10.1.2
Initial characterization of clusters
10.1.3
Visualization of the clusters
10.1.4
Sums of Squares from the clustering
10.2
Selecting the number of clusters - a data-driven approach
10.3
Segmentation - another example
10.3.1
Cluster analysis
10.3.2
Plot the model using PCA
10.3.3
Add to dataset
10.3.4
Comments
11
Profiling segments
11.1
Table 1 as a profiling tool
11.2
Visualization of Consumer Segments
11.3
Creating the contingency table
11.4
Plot the numbers
11.5
Contingency table
11.5.1
Pearson Chi-square test
11.6
Correspondence Analysis
12
Logistic Regression
12.1
Segmentation/Clustering
12.2
Fitting the logistic regression-model
12.3
Probabilities of segment membership:
12.4
Odds ratios
12.5
ORs and Probs
12.6
Effect of Age
12.7
Multivariate analysis
12.7.1
Descriptives
12.7.2
Two nested models
12.7.3
Coefficients
12.7.4
Re-level
12.7.5
All pairwise comparisons
12.8
Segment 2 and 3
12.9
A new set of data
12.10
Logistic regression for demographic characterization
12.10.1
Age
12.10.2
Gender
12.11
TASK
12.12
The tidyverse way
12.13
Comment
13
CATA data (Check-All-That-Apply)
13.1
Importing and looking at the beer data
13.2
Two versions of the data
13.3
Cochran’s Q test
13.3.1
Post hoc test
13.3.2
For all attributes in one run (nice to know)
13.4
PCA on CATA data
14
Hedonic rating (e.g. liking scores)
14.1
Plotting liking scores
14.1.1
PCA of hedonic ratings
14.2
Simple mixed models
14.2.1
Post hoc test
14.3
Multivariable models
14.3.1
Additive models
14.3.2
Effect modification and Interactions
15
CATA and Hedonics
15.1
Individual attributes and liking
15.1.1
An example with Refreshing
15.1.2
All attributes
15.2
PCA on CATA and Liking
15.2.1
A beer centric model
15.3
Analysis by PLS
15.3.1
PLS basics
15.3.2
PLS to predict liking based on CATA data
15.3.3
Regression coefficients
15.4
L-PLS
15.5
Creating X1, X2, X3.
15.6
Building the model using lpls
15.6.1
Finding the explained variance for exo L-PLS
15.6.2
Cross-validation
15.6.3
Visualization
16
Projective mapping
16.1
Example from mapping of XX
16.2
A Collated version of the data
16.3
PCA on Collated data
16.4
17
TFIH Exercises
17.1
Exercise 1: Descriptive statistics and plots
17.1.1
CATA counts
17.1.2
Hedonics
17.2
Exercise 2: Consumer background and PCA
17.2.1
Demographics
17.2.2
PCA on Interests
17.3
Exercise 3: PCA on CATA counts
17.4
Exercise 4: Cochran’s Q test on CATA binary data
17.5
Exercise 5: Hedonic ratings and consumer characteristics
17.5.1
PCA on joint data
17.5.2
All demographics
17.6
Exercise 6: PCA on CATA counts and hedonic ratings
17.7
Exercise 7: Mixed modelling on hedonic ratings
18
CHAPTERS to APPEAR
19
Latent Factor Models
20
LPLS
21
Confirmatory Factor Analysis using lavaan
21.1
Example - Food Neophobia
22
Structural Equation Modelling
22.1
Example - Theory of Planned Behaviour
23
PLSDA on CATA and liking
24
Text mining of consumer reviews
25
Text mining of open-ended survey responses
Data Analysis in R for Sensory and Consumer Science
Chapter 18
CHAPTERS to APPEAR