11 results
Results by university and institute

Publication
Evaluation of synthetic data generators on complex tabular data
(Springer, 2024) Thees, Oscar; Novak, Jiri; Templ, Matthias; Domingo-Ferrer, Josep; Önen, Melek
Synthetic data generators are widely utilized to produce synthetic data, serving as a complement or replacement for real data. However, the utility of data is often limited by its complexity. The aim of this paper is to show their performance using a complex data set that includes cluster structures and complex relationships. We compare different synthesizers such as synthpop, Synthetic Data Vault, simPop, Mostly AI, Gretel, Realtabformer, and arf, taking into account their different methodologies with (mostly) default settings, on two properties: syntactical accuracy and statistical accuracy. As a complex and popular data set, we used the European Statistics on Income and Living Conditions data set. Almost all synthesizers resulted in low data utility and low syntactical accuracy. The results indicated that for such complex data, simPop, a computational and methodological framework for simulating complex data based on conditional modeling, emerged as the most effective approach for static tabular data and is superior compared to other conditional or joint modelling approaches.
04B - Contribution to conference proceedings

Publication
A new version of the Langelier-Ludwig square diagram under a compositional perspective
(Elsevier, 2022) Templ, Matthias; Gozzi, Caterina; Buccianti, Antonella
01A - Article in scientific journal

Publication
Statistical analysis of chemical element compositions in food science: problems and possibilities
(MDPI, 2022) Templ, Matthias; Templ, Barbara
In recent years, many analyses have been carried out to investigate the chemical components of food data. However, studies rarely consider the compositional pitfalls of such analyses. This is problematic as it may lead to arbitrary results when non-compositional statistical analysis is applied to compositional datasets.
In this study, compositional data analysis (CoDa), which is widely used in other research fields, is compared with classical statistical analysis to demonstrate how the results vary depending on the approach and to show the best possible statistical analysis. For example, honey and saffron are highly susceptible to adulteration and imitation, so the determination of their chemical elements requires the best possible statistical analysis. Our study demonstrated how principal component analysis (PCA) and classification results are influenced by the pre-processing steps conducted on the raw data, and the replacement strategies for missing values and non-detects. Furthermore, it demonstrated the differences in results when compositional and non-compositional methods were applied. Our results suggested that the outcome of the log-ratio analysis provided better separation between the pure and adulterated data and allowed for easier interpretability of the results and a higher accuracy of classification. Similarly, it showed that classification with artificial neural networks (ANNs) works poorly if the CoDa pre-processing steps are left out. From these results, we advise the application of CoDa methods for analyses of the chemical elements of food and for the characterization and authentication of food products.
01A - Article in scientific journal

Publication
Prof. Rudolf Dutter (1946-2023): Ein Nachruf [An obituary]
(Austrian Statistical Society, 07/2023) Filzmoser, Peter; Templ, Matthias
The former TU Wien professor Rudolf Dutter died on 5 May 2023 from the consequences of his long-standing diabetes. Prof. Dutter was editor of the Österreichische Zeitschrift für Statistik (Austrian Journal of Statistics) from 1997 to 2003, and he carried out this role with great commitment on behalf of the Austrian Statistical Society.
One of his activities was setting up and running a website for the journal, which provided open access to the articles. A short obituary in this journal, also as information for the members of the society, therefore seems more than fitting.
01A - Article in scientific journal

Publication
A systematic overview on methods to protect sensitive data provided for various analyses
(Springer, 2022) Templ, Matthias; Sariyar, Murat
01A - Article in scientific journal

Publication
Enhancing precision in large-scale data analysis: an innovative robust imputation algorithm for managing outliers and missing values
(MDPI, 2023) Templ, Matthias
Navigating the intricate world of data analytics, one method has emerged as a key tool in confronting missing data: multiple imputation. Its strength is further fortified by its powerful variant, robust imputation, which enhances the precision and reliability of its results. In the challenging landscape of data analysis, non-robust methods can be swayed by a few extreme outliers, leading to skewed imputations and biased estimates. This can apply to both representative outliers (those true yet unusual values of your population) and non-representative outliers, which are mere measurement errors. Detecting these outliers in large or high-dimensional data sets often becomes as complex as unraveling a Gordian knot. The solution? Turn to robust imputation methods. Robust (imputation) methods effectively manage outliers and exhibit remarkable resistance to their influence, providing a more reliable approach to dealing with missing data. Moreover, these robust methods offer flexibility, remaining usable even if the imputation model is not a perfect fit. They are akin to a well-designed buffer system, absorbing slight deviations without compromising overall stability. In the latest advancement of statistical methodology, a new robust imputation algorithm has been introduced.
This innovative solution addresses three significant challenges with robustness. It utilizes robust bootstrapping to manage model uncertainty during the imputation of a random sample; it incorporates robust fitting to reinforce accuracy; and it takes into account imputation uncertainty in a resilient manner. Furthermore, any complex regression or classification model for any variable with missing data can be run through the algorithm. With this new algorithm, we move one step closer to optimizing the accuracy and reliability of handling missing data. Using a realistic data set and a simulation study including a sensitivity analysis, the new algorithm imputeRobust shows excellent performance compared with other common methods. Effectiveness was demonstrated by measures of precision for the prediction error, the coverage rates, and the mean square errors of the estimators, as well as by visual comparisons.
01A - Article in scientific journal

Publication
Can we ignore the compositional nature of compositional data by using deep learning approaches?
(Pearson, 2021) Templ, Matthias; Perna, Cira; Salvati, Nicola; Schirripa Spagnolo, Francesco
04B - Contribution to conference proceedings

Publication
Artificial neural networks to impute rounded zeros in compositional data
(Springer, 2021) Templ, Matthias; Filzmoser, Peter; Hron, Karel; Martín-Fernández, Josep Antoni; Palarea-Albaladejo, Javier
04A - Contribution to edited volume

Publication
Coincidence of temperature extremes and phenological events of grapevines
(Institut des Sciences de la Vigne et du Vin (I S V V), 2021) Templ, Barbara; Templ, Matthias; Barbieri, Roberto; Meier, Michael; Zufferey, Vivian
A growing number of studies have highlighted the consequences of climate change on agriculture, including the impacts of climate extremes such as drought, heat waves and frost.
The aim of this study was to assess the influence of temperature extremes on various phenological events of grapevine varieties in Southwest Switzerland (Leytron, Canton of Valais). We aimed to capture the occurrence of extreme events in specific years in various grapevine varieties and at different phenological phases to rank the varieties based on their sensitivity to temperature extremes and thus quantify their robustness. Phenological observations (1978–2018) of six Vitis vinifera varieties (Arvine, Chardonnay, Chasselas, Gamay, Pinot noir, and Syrah) were subjected to event coincidence analysis. Extreme events were defined as values in the uppermost or lowermost percentiles of the timing of the phenophases and daily temperatures within a 30-day window before the phenophase event occurred. Significantly more extreme temperature and phenological events occurred in Leytron between 2003 and 2017 than in the earlier years, with the years 2007, 2011, 2014 and 2017 being remarkable in terms of the number of extreme coincidence events. Moreover, bud development and flowering experienced significantly more extreme coincidence events than other phenophases; however, the occurrence rate of extreme coincidence events was independent of the phenophase. Based on the total number of extreme events, the varieties did not differ in their responses to temperature extremes. Therefore, event coincidence analysis is an appropriate tool to quantify the occurrence of extreme events. The occurrence of extreme temperature events clearly affected the advancement of the timings of phenological events in various grapevines. 
However, there were no varietal differences in terms of response to extreme temperatures; thus, additional research is warranted to outline the best adaptation measures.
01A - Article in scientific journal

Publication
Comparison of zero replacement strategies for compositional data with large numbers of zeros
(Elsevier, 2021) Lubbe, Sugnet; Filzmoser, Peter; Templ, Matthias
01A - Article in scientific journal