Data Science & Engineering into Food Science: A Novel Big Data Platform for Low Molecular Weight Gelators’ Behavioral Analysis
Keywords: big data, data science, food science, colloidal science, low molecular weight gelators, self-assembly
The objective of this article is to introduce a comprehensive
end-to-end solution that enables the application
of state-of-the-art data science and analytics
methodologies to a food science problem: automating
the loading, homogenization, complex processing, and
real-time accessibility of low-molecular-weight gelator
(LMWG) data in order to gain insights into LMWG
assembly behavior, i.e., whether or not a gelator
forms a gel in a given solvent.
Most of the work on LMWGs within the field of Colloidal and
Food Science has centered on
identifying adequate solvents that can generate stable
gels and evaluating how LMWG characteristics
affect gelation. As a result, extensive databases of
results from different laboratory experiments have been
compiled methodically but manually. The
complexity of those databases, and the errors caused
by manual data entry, can interfere with the analysis
and visualization of relations and patterns, limiting the
utility of the experimental work.
To address these issues, we propose a
scalable and flexible Big Data solution that enables the
unification, homogenization, and availability of the data
through the application of Big Data tools and methodologies.
This approach helps optimize data acquisition
during LMWG research and reduces redundant data processing
and analysis, while also enabling researchers
to explore a wider range of testing conditions and push
forward the frontier of Food Science research.