Alférez Salinas, Germán Harvey

Document Type


Publication Date



Bi-plots are commonly used in geochemical analyses, but they become cumbersome when many variables must be examined at once. This thesis therefore explores the application of unsupervised machine learning techniques, specifically principal component analysis (PCA) and K-Means clustering, to large geochemical data sets from two distinct regions: Hawaii and the Peninsular Ranges Batholith (PRB) in Southern California. The IBM Foundational Methodology for Data Science was followed to guide data preparation and analysis. PCA provided dimensionality reduction and revealed which features contributed most to the variance in the data. K-Means clustering then allowed for deeper interpretation of the data. The analysis yielded insights into the composition and differentiation of magma and rocks from the two regions. Future work should include a deeper analysis of the clusters and a determination of how geochemical plots relate to underlying geochemical processes. These results could be helpful in relating "catastrophic" magmatic processes and geochemistry to the Genesis record.
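
As a rough illustration of the workflow summarized in the abstract, the sketch below standardizes a small table of samples, projects it onto two principal components, and clusters the projected samples with K-Means. It is not the thesis's code. The feature names, the synthetic values, the standardization step, the choice of two components and two clusters, and the helper functions `standardize`, `pca`, and `kmeans` are all assumptions made for this example, and only NumPy is used.

```python
import numpy as np


def standardize(X):
    """Center each feature and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca(X, n_components):
    """PCA via SVD of already-centered data.

    Returns the projected scores, the loadings (one row per component),
    and the fraction of total variance explained by each kept component.
    """
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    scores = X @ Vt[:n_components].T
    return scores, Vt[:n_components], explained[:n_components]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every sample to its nearest center (Euclidean distance).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its samples; keep it if empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic, made-up compositions (wt%) for two hypothetical sample groups.
# These numbers are illustrative only and are not data from the thesis.
features = ["SiO2", "MgO", "FeO", "CaO", "K2O"]
rng = np.random.default_rng(42)
group_a = rng.normal([48.0, 9.0, 11.0, 10.0, 0.5],
                     [1.0, 1.0, 0.8, 0.7, 0.1], size=(50, 5))
group_b = rng.normal([68.0, 1.5, 3.0, 3.0, 3.5],
                     [1.0, 0.5, 0.5, 0.5, 0.4], size=(50, 5))

X = standardize(np.vstack([group_a, group_b]))
scores, loadings, explained = pca(X, n_components=2)
labels, centers = kmeans(scores, k=2)

# Features with large absolute loadings on PC1 drive most of the variance.
for name, weight in zip(features, loadings[0]):
    print(f"{name}: PC1 loading {weight:+.2f}")
print("Explained variance ratio:", np.round(explained, 3))
print("Cluster sizes:", np.bincount(labels))
```

In a sketch like this, the PC1 loadings indicate which features vary together most strongly, and the cluster labels can then be compared against known sample provenance. A real analysis would also need to choose the number of components and clusters deliberately rather than fixing both at two.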