International Journal of Computer Science and Data Engineering
https://csdb.sciforce.org/CSDB
<p>Advancing Knowledge in the International Journal of Computer Science and Data Engineering: About Computer Science and Database Applications (CSDA) by Sciforce Publications</p> <p>Welcome to the International Journal of Computer Science and Data Engineering (CSDA), a publication by Sciforce Publications. CSDA serves as a platform for disseminating research, technologies, and ideas in the fields of computer science and database applications. This "About Us" section provides an overview of CSDA, its mission, and its dedication to fostering advancements in computer science and database technologies.</p>
Sciforce Publications
en-US
International Journal of Computer Science and Data Engineering
3066-6813
Performance Optimization of Data Vault 2.0 Implementation Using Linear Regression Analysis
https://csdb.sciforce.org/CSDB/article/view/267
<p>This study examines the effectiveness of data warehouse implementation methods in addressing modern data warehousing challenges, focusing on performance optimization through linear regression analysis. Data warehouses serve as critical repositories for management decision-making, yet they face growing challenges due to increasing data volume, velocity, and heterogeneity. Traditional data warehouse approaches provide limited capabilities for processing non-standard data types, while Data Warehouse 2.0 offers enhanced flexibility, scalability, and operational efficiency. This research analyzes 350 observations and examines four key parameters: source system load time, number of loaded records, transformation complexity score, and load performance index. Descriptive statistics reveal significant variation across operational conditions, with source system load times averaging 50 seconds and ranging from 11 to 96 seconds.
The number of loaded records varies from 2,124 to almost 100,000, while the transformation complexity scores range from 0.09 to 123.64. Correlation analysis shows a strong positive relationship (r = 0.84) between the source system load time and the load performance index, indicating that longer processing times are associated with higher performance indices. Conversely, the number of loaded records shows a moderate negative relationship with performance (r = -0.49), indicating a potential bottleneck with large datasets. The linear regression model achieved R² values of 0.9224 on the training data and 0.8483 on the test data, demonstrating strong predictive accuracy and effective generalization. The model's low error metrics (RMSE of 9.99 for training and 10.53 for testing) support its reliability in predicting data vault performance from operational parameters, providing insights for optimizing data warehouse implementations.</p>
Rajender Radharam
Copyright (c) 2024 Rajender Radharam
https://creativecommons.org/licenses/by-nc/4.0
2024-12-16 2024-12-16
1 3 1 6
10.55124/csdb.v1i3.267
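The quantities reported in the abstract above (Pearson correlation, R², and RMSE on training and test data) can be illustrated with a minimal sketch. This is not the author's code or data: the values are synthetic, the single-predictor model is a simplification of the study's multi-parameter regression, and the 80/20 split, the noise level, and all function and variable names are assumptions made for illustration only.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Synthetic placeholder data (NOT the study's dataset): 350 load times in
# the reported 11-96 second range, with an invented linear relationship.
rng = random.Random(0)
load_time = [rng.uniform(11, 96) for _ in range(350)]
perf_index = [2.0 * t + rng.gauss(0, 10) for t in load_time]

# Assumed 80/20 train/test split; the abstract does not state the split.
split = int(0.8 * len(load_time))
x_train, x_test = load_time[:split], load_time[split:]
y_train, y_test = perf_index[:split], perf_index[split:]

intercept, slope = fit_line(x_train, y_train)
pred_train = [intercept + slope * t for t in x_train]
pred_test = [intercept + slope * t for t in x_test]

print("r (load time vs. index):", round(pearson_r(load_time, perf_index), 3))
print("train R2:", round(r_squared(y_train, pred_train), 4),
      "RMSE:", round(rmse(y_train, pred_train), 2))
print("test  R2:", round(r_squared(y_test, pred_test), 4),
      "RMSE:", round(rmse(y_test, pred_test), 2))
```

Comparing the training and test metrics side by side is what supports a generalization claim: a test R² far below the training R² would suggest overfitting, whereas values close to each other suggest the fitted relationship carries over to unseen observations.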