1. Introduction.
Statistics cannot be done without data, and in multivariate statistics the data are usually arranged in a matrix, with the objects studied in the rows and the variables in the columns.
Once we have the data, we are interested in providing a good description of it, which can be done numerically or graphically.
Matrix algebra and matrix calculus are fundamental to multivariate analysis.
Describing multivariate data involves specifying the data matrix and providing measures of central tendency and of dispersion; for dispersion, these are the variance-covariance matrix or global measures of variability.
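As a minimal sketch of such a description, the following Python snippet (using NumPy) computes a mean vector, a variance-covariance matrix, and two global measures of variability. The data values are hypothetical, and the choice of the trace (total variance) and the determinant (generalized variance) as global measures is an assumption made for illustration; the text does not specify which measures are meant.

```python
import numpy as np

# Hypothetical data matrix: 5 objects (rows) by 3 variables (columns).
X = np.array([
    [2.0, 4.0, 1.0],
    [3.0, 5.0, 2.0],
    [4.0, 7.0, 2.0],
    [5.0, 8.0, 4.0],
    [6.0, 11.0, 6.0],
])

# Measure of central tendency: the vector of column means.
mean_vector = X.mean(axis=0)

# Measure of dispersion: the variance-covariance matrix.
# rowvar=False tells NumPy that the variables are in the columns;
# the default divisor is n - 1 (sample covariance).
S = np.cov(X, rowvar=False)

# Two possible global measures of variability (assumed for illustration).
total_variance = np.trace(S)             # sum of the individual variances
generalized_variance = np.linalg.det(S)  # determinant of S

print("mean vector:", mean_vector)
print("covariance matrix:\n", S)
print("total variance:", total_variance)
print("generalized variance:", generalized_variance)
```

The diagonal of `S` holds the variance of each variable and the off-diagonal entries hold the covariances, so `S` is always symmetric.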
Another topic that plays a significant role in multivariate analysis is the concept of distance: distances between objects, between variables, etc. Distances are the foundation of multivariate methods such as cluster analysis or correspondence analysis.
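To make the idea concrete, here is a small Python sketch (using NumPy) that builds a matrix of pairwise Euclidean distances. The Euclidean distance is only one possible choice, assumed here for illustration, and the helper name `euclidean_distance_matrix` and the data are hypothetical.

```python
import numpy as np

# Hypothetical data matrix: 3 objects (rows) by 2 variables (columns).
X = np.array([
    [0.0, 0.0],
    [3.0, 4.0],
    [6.0, 8.0],
])

def euclidean_distance_matrix(data):
    """Return the matrix of Euclidean distances between the rows of data."""
    # Broadcasting produces every pairwise difference between rows.
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Distances between objects: compare the rows of X.
D_objects = euclidean_distance_matrix(X)

# Distances between variables: compare the columns of X (rows of X.T).
D_variables = euclidean_distance_matrix(X.T)

print(D_objects)
print(D_variables)
```

Both results are symmetric matrices with zeros on the diagonal, which is the form in which distances are typically passed to a method such as cluster analysis.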
For graphical representation, besides classic plots such as histograms and scatterplots, we will explore how to create multiple box plots and address the delicate issue of outliers: how to identify them and how to mitigate their negative effects.
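One common way to flag outliers in a single variable is the rule conventionally used to draw box plot whiskers: a value is flagged if it lies more than 1.5 interquartile ranges beyond the quartiles. The Python sketch below (using NumPy) illustrates that rule; it is an assumption about which criterion might be used, not necessarily the method developed later, and the function name `iqr_outlier_mask` and the data are hypothetical.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Flag values lying more than k * IQR outside the quartiles."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

# Hypothetical sample with one clearly extreme value.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])
mask = iqr_outlier_mask(x)

print(x[mask])  # values flagged as potential outliers
```

Applying the function to each column of a data matrix treats every variable separately, which corresponds to drawing one box plot per variable.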
Finally, regarding matrix algebra and matrix calculus, it is necessary to understand how matrices are added and multiplied, what the diagonal, the trace, and the determinant of a matrix are, and, most importantly, their properties. In particular, the diagonalization of a matrix and the singular value decomposition are two essential concepts; both are, in essence, factorizations of the matrix.
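The two factorizations can be sketched in Python with NumPy as follows. The matrices are hypothetical, and the diagonalization is shown for a symmetric matrix only (the case of a variance-covariance matrix), which is an assumption made to keep the example short.

```python
import numpy as np

# Hypothetical symmetric matrix, e.g. a small variance-covariance matrix.
A = np.array([
    [4.0, 1.0],
    [1.0, 3.0],
])

# Diagonalization: A = V diag(eigenvalues) V^T, with orthonormal columns in V.
# eigh is NumPy's routine for symmetric (Hermitian) matrices.
eigenvalues, V = np.linalg.eigh(A)
A_rebuilt = V @ np.diag(eigenvalues) @ V.T

# The trace and determinant follow from the eigenvalues.
trace_A = np.trace(A)     # equals the sum of the eigenvalues
det_A = np.linalg.det(A)  # equals the product of the eigenvalues

# Singular value decomposition of a rectangular matrix: B = U diag(s) Vt.
B = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
])
U, s, Vt = np.linalg.svd(B, full_matrices=False)
B_rebuilt = U @ np.diag(s) @ Vt

print(np.allclose(A, A_rebuilt), np.allclose(B, B_rebuilt))
```

Unlike diagonalization, the singular value decomposition applies to any rectangular matrix, which is why it can be applied directly to a data matrix with objects in the rows and variables in the columns.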