Article

Revisiting Database Indexing for Parallel and Accelerated Computing: A Comprehensive Study and Novel Approaches

1 Applied Research Institute, Polytechnic Institute of Coimbra, 3045-093 Coimbra, Portugal
2 Instituto de Telecomunicações, 6201-001 Covilhã, Portugal
3 Department of Informatics, Polytechnic of Viseu, 3504-510 Viseu, Portugal
* Author to whom correspondence should be addressed.
Information 2024, 15(8), 429; https://doi.org/10.3390/info15080429
Submission received: 6 June 2024 / Revised: 20 July 2024 / Accepted: 22 July 2024 / Published: 24 July 2024
(This article belongs to the Special Issue Advances in High Performance Computing and Scalable Software)

Abstract:
While the importance of indexing strategies for optimizing query performance in database systems is widely acknowledged, the impact of rapidly evolving hardware architectures on indexing techniques has been an underexplored area. As modern computing systems increasingly leverage parallel processing capabilities, multi-core CPUs, and specialized hardware accelerators, traditional indexing approaches may not fully capitalize on these advancements. This comprehensive experimental study investigates the effects of hardware-conscious indexing strategies tailored for contemporary and emerging hardware platforms. Through rigorous experimentation on a real-world database environment using the industry-standard TPC-H benchmark, this research evaluates the performance implications of indexing techniques specifically designed to exploit parallelism, vectorization, and hardware-accelerated operations. By examining approaches such as cache-conscious B-Tree variants, SIMD-optimized hash indexes, and GPU-accelerated spatial indexing, the study provides valuable insights into the potential performance gains and trade-offs associated with these hardware-aware indexing methods. The findings reveal that hardware-conscious indexing strategies can significantly outperform their traditional counterparts, particularly in data-intensive workloads and large-scale database deployments. Our experiments show improvements ranging from 32.4% to 48.6% in query execution time, depending on the specific technique and hardware configuration. However, the study also highlights the complexity of implementing and tuning these techniques, as they often require intricate code optimizations and a deep understanding of the underlying hardware architecture. Additionally, this research explores the potential of machine learning-based indexing approaches, including reinforcement learning for index selection and neural network-based index advisors. 
While these techniques show promise, with performance improvements of up to 48.6% in certain scenarios, their effectiveness varies across different query types and data distributions. By offering a comprehensive analysis and practical recommendations, this research contributes to the ongoing pursuit of database performance optimization in the era of heterogeneous computing. The findings inform database administrators, developers, and system architects on effective indexing practices tailored for modern hardware, while also paving the way for future research into adaptive indexing techniques that can dynamically leverage hardware capabilities based on workload characteristics and resource availability.

    1. Introduction

    In the realm of database management systems, efficient query processing is paramount for ensuring optimal system responsiveness and usability. Indexing strategies play a crucial role in this pursuit, facilitating rapid data retrieval and significantly influencing query execution performance. While the benefits of traditional indexing techniques, such as B-Tree and Hash indexes, are well-established, the rapid evolution of hardware architectures has introduced new challenges and opportunities for index optimization.
    As modern computing systems increasingly leverage parallel processing capabilities, multi-core CPUs, and specialized hardware accelerators like graphics processing units (GPUs), traditional indexing approaches may not fully capitalize on these advancements. Existing indexing methods were primarily designed for sequential execution on single-core CPUs, potentially limiting their ability to exploit the performance gains offered by contemporary and emerging hardware platforms.
    This research study investigates the impact of hardware-conscious indexing strategies tailored for modern and future hardware architectures. By exploring indexing techniques specifically designed to leverage parallelism, vectorization, and hardware-accelerated operations, this study aims to provide valuable insights into the potential performance gains and trade-offs associated with these hardware-aware indexing methods.
    Our investigation encompasses a wide range of indexing strategies, including the following:
    • Traditional B-Tree and Hash indexes, serving as a baseline for comparison;
    • Cache-conscious B-Tree variants optimized for modern CPU cache hierarchies;
    • SIMD-optimized Hash indexes leveraging vector processing capabilities;
    • GPU-accelerated spatial indexing techniques for specialized query workloads;
    • Machine learning-based approaches, including reinforcement learning for index selection and neural network-based index advisors.
    Through rigorous experimentation on a real-world database environment using the industry-standard TPC-H benchmark, this research evaluates the performance implications of these indexing approaches. Our experiments were conducted across multiple hardware configurations, including single-core CPUs, multi-core CPUs, and GPU-accelerated systems, with database sizes ranging from 1 GB to 100 GB. This comprehensive setup allows us to assess the scalability and effectiveness of each indexing technique under various conditions.
    The findings of this study reveal that hardware-conscious indexing strategies can offer significant performance improvements over traditional methods, with gains ranging from 15% to 65% in query execution time, depending on the specific technique and hardware configuration. However, these improvements are not uniform across all scenarios, and the effectiveness of each method varies based on factors such as query complexity, data distribution, and hardware characteristics.
    Our research also highlights the potential of machine learning-based indexing approaches. While these techniques show promise, with performance improvements of up to 40% in certain scenarios, their effectiveness is highly dependent on the nature of the workload and the quality of the training data. This variability underscores the need for careful evaluation and tuning when applying ML-based techniques in practical database systems.
    By offering a comprehensive analysis and practical recommendations, this research contributes to the ongoing pursuit of database performance optimization in the era of heterogeneous computing. The findings inform database administrators, developers, and system architects on effective indexing practices tailored for modern hardware, while also paving the way for future research into adaptive indexing techniques that can dynamically leverage hardware capabilities based on workload characteristics and resource availability.
    To ensure the reproducibility and credibility of our results, we provide detailed descriptions of our experimental methodology, including hardware specifications, software configurations, and query workloads.
    The remainder of this paper is organized as follows: Section 2 presents a comprehensive review of the state of the art in database indexing techniques. Section 3 details our experimental setup and methodology. Section 4 presents and analyzes the results of our experiments. Section 5 discusses the implications of our findings and their practical applications. Finally, Section 6 concludes the paper and outlines directions for future research.

    2. State of the Art

    In this section, we present an overview of the current state of the art in indexing strategies for database systems, with a particular focus on techniques that leverage modern hardware capabilities. We cover traditional indexing methods as well as emerging approaches tailored for parallel processing, vectorization, and hardware acceleration.

    2.1. Traditional Indexing Techniques

    Traditional indexing techniques, such as B-Tree and Hash indexes, have been widely adopted and remain prevalent in database management systems. However, these methods were primarily designed for sequential execution on single-core CPUs, potentially limiting their ability to exploit the performance gains offered by contemporary hardware architectures.

    2.1.1. B-Tree Indexes

    B-Tree indexes are a cornerstone of database indexing, renowned for their efficiency in handling range queries and ordered traversal [1]. These indexes organize data in a balanced tree structure, facilitating logarithmic-time search operations. While B-Tree indexes have proven effective in traditional database systems, their sequential nature may hinder their ability to fully leverage the parallelism offered by modern multi-core CPUs [2].

    2.1.2. Hash Indexes

    Hash indexes offer an alternative approach to indexing, excelling in constant-time retrieval for exact match queries [3]. These indexes utilize a hash function to map keys to specific buckets, allowing for efficient key-value lookups. However, similar to B-Tree indexes, traditional hash indexing techniques may not fully exploit the parallelism and vectorization capabilities of modern hardware [4].

    2.2. Hardware-Conscious Indexing Strategies

    As computing systems increasingly embrace parallelism, vectorization, and specialized hardware accelerators, there has been a growing interest in developing indexing strategies that can leverage these hardware capabilities effectively [5].

    2.2.1. Cache-Conscious B-Tree Variants

    Cache-conscious B-Tree variants aim to optimize the cache utilization and memory access patterns of traditional B-Tree indexes, thereby improving their performance on modern CPU architectures [6]. These techniques employ strategies such as cache-aware node layouts, prefetching mechanisms, and data compression to minimize cache misses and enhance overall memory efficiency [7].

    2.2.2. SIMD-Optimized Hash Indexes

    Single instruction, multiple data (SIMD) instructions enable parallel processing of multiple data elements simultaneously, offering potential performance gains for certain types of computations. SIMD-optimized hash indexes leverage these instructions to accelerate hash computations and key comparisons, reducing the computational overhead associated with hash indexing [8].

    2.2.3. GPU-Accelerated Indexing

    Graphics processing units (GPUs) have emerged as powerful parallel computing platforms, offering massive parallelism and high computational throughput. GPU-accelerated indexing techniques offload index construction, traversal, and query processing tasks to the GPU, leveraging its parallel processing capabilities for accelerated data retrieval [9,10]. These approaches are particularly beneficial for data-intensive workloads and indexing techniques that exhibit high degrees of parallelism, such as spatial indexing and inverted indexes.

    2.2.4. Hybrid CPU-GPU Indexing Strategies

    Recent research has explored hybrid indexing approaches that combine the strengths of both CPUs and GPUs [11]. These strategies typically involve distributing indexing tasks between the CPU and GPU based on their respective strengths, such as using the CPU for complex decision-making processes and the GPU for massively parallel computations. Hybrid approaches aim to achieve better overall performance by balancing the workload across different hardware components and minimizing data transfer overhead.

    2.2.5. Adaptive and Hybrid Indexing

    Adaptive and hybrid indexing strategies aim to dynamically adjust index structures and configurations based on workload patterns, access frequencies, and hardware resource availability. Adaptive indexing mechanisms continuously monitor query execution metrics and system resource utilization to optimize index structures in real time [12]. Hybrid indexing approaches, on the other hand, combine multiple indexing techniques to leverage their respective strengths and mitigate their weaknesses, offering a versatile solution capable of handling diverse query workloads efficiently [13].

    2.3. Machine Learning-Driven Indexing

    The integration of machine learning techniques into database indexing strategies represents a significant shift in approach, offering the potential for more adaptive and intelligent index management [14].

    2.3.1. Reinforcement Learning for Index Selection

    Reinforcement learning (RL) techniques have been applied to the problem of index selection, allowing database systems to learn optimal indexing strategies through a process of trial and error [15]. These approaches model the index selection problem as a Markov decision process, where the RL agent learns to make indexing decisions that optimize query performance over time.

    2.3.2. Neural Network-Based Index Advisors

    Neural networks have been employed to create index advisors that can recommend appropriate indexing strategies based on workload characteristics and query patterns [16]. These models are trained on historical query data and system performance metrics, allowing them to capture complex relationships between query properties and effective indexing strategies.

    2.3.3. Learned Index Structures

    Recent research has explored the concept of learned index structures, which use machine learning models to replace traditional index structures partially or entirely [17]. These approaches aim to learn the underlying data distribution and use this knowledge to provide faster lookups compared to traditional indexing methods.

    2.4. Challenges and Open Problems

    Despite the advancements in hardware-conscious and machine learning-driven indexing techniques, several challenges remain:
    • Balancing the trade-offs between index creation time, query performance, and storage overhead
    • Developing indexing strategies that can adapt to dynamic workloads and evolving hardware capabilities
    • Ensuring the robustness and reliability of machine learning-based indexing approaches in production environments
    • Addressing the increased complexity and potential lack of interpretability in advanced indexing techniques
    • Optimizing data movement and minimizing communication overhead in distributed and heterogeneous computing environments
    These challenges present opportunities for future research and innovation in the field of database indexing.
    In conclusion, the landscape of database indexing is evolving rapidly to keep pace with advancements in hardware architectures and the increasing complexity of data workloads. While traditional indexing techniques continue to play a crucial role, emerging hardware-conscious and machine learning-driven approaches offer promising avenues for performance optimization. Our research aims to contribute to this evolving field by providing a comprehensive evaluation of these diverse indexing strategies across various hardware configurations and workload scenarios.

    3. Experimental Setup

    This section details our methodology, addressing key aspects of configuration and implementation.

    3.1. Database Management System

    Our experiments were conducted using PostgreSQL 13.4 [18], a robust and widely adopted open-source relational database management system (RDBMS). PostgreSQL offers extensive support for various indexing techniques, making it an ideal platform for our study. Additionally, we leveraged PostgreSQL’s native parallelization capabilities and optimization features to ensure fair comparisons across different indexing strategies.

    3.2. TPC-H Benchmark

    To generate realistic and industry-standard benchmark data, we utilized version 3.0.1 of the TPC-H benchmark (Transaction Processing Performance Council, Benchmark H) [19]. TPC-H simulates a data warehousing scenario, providing a synthetic dataset comprising tables such as orders, customers, line items, parts, and suppliers. We generated datasets of varying sizes (1 GB, 10 GB, and 100 GB) to evaluate the indexing strategies under different data volumes and workload intensities.

    3.3. Hardware Configurations

    To investigate the impact of hardware architectures on indexing performance, we used three distinct platforms:
    • Single-core CPU: Intel Core i7-8700 CPU (single core enabled) running at 3.2 GHz with 16 GB of DDR4-2666 RAM and a 512 GB NVMe SSD.
    • Multi-core CPU: AMD Ryzen Threadripper 3970X processor with 32 cores and 64 threads, operating at a base clock speed of 3.7 GHz. The system was equipped with 128 GB of DDR4-3200 RAM and a 1 TB NVMe SSD.
    • GPU-accelerated system: NVIDIA Tesla V100 GPU with 32 GB of HBM2 memory, paired with an Intel Xeon Gold 6248R CPU (24 cores, 3.0 GHz base clock) and 256 GB of DDR4-2933 RAM.
    All systems ran Ubuntu 20.04 LTS with the same kernel version (5.4.0) to minimize OS-related variations. We disabled unnecessary background processes and services to reduce system noise during experiments.

    3.4. Indexing Techniques Implemented

    We implemented and evaluated the following indexing techniques:
    • Traditional B-Tree and Hash indexes (PostgreSQL native implementations);
    • Cache-conscious B-Tree variant (custom implementation based on [6]);
    • SIMD-optimized Hash index (custom implementation using Intel AVX-512 instructions);
    • GPU-accelerated R-Tree for spatial indexing (implemented using CUDA 11.0);
    • Reinforcement learning-based index selection (implemented using TensorFlow 2.4);
    • Neural network-based index advisor (implemented using PyTorch 1.8).

    3.5. Query Workload

    We developed a comprehensive query set based on the 22 TPC-H benchmark queries, supplemented with additional custom queries designed to stress-test specific indexing strategies. The query set encompassed the following:
    • Range queries (e.g., TPC-H Q6);
    • Exact match lookups (e.g., TPC-H Q4);
    • Join queries (e.g., TPC-H Q3, Q10);
    • Aggregation queries (e.g., TPC-H Q1);
    • Complex analytical queries (e.g., TPC-H Q18).
    To ensure a diverse and representative workload, we generated query streams with varying distributions of query types and parameters, simulating realistic database usage patterns. Each query stream consisted of 1000 queries, with the distribution of query types varying based on the specific experiment.
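The generator itself is not shown in the paper; the following is a minimal, dependency-free sketch of how such a 1000-query stream with a weighted mix of the five query classes could be produced (the class names, weights, and seed are hypothetical):

```python
import random

# Representative TPC-H queries per class, taken from the list above.
QUERY_CLASSES = {
    "range": ["Q6"],
    "exact_match": ["Q4"],
    "join": ["Q3", "Q10"],
    "aggregation": ["Q1"],
    "analytical": ["Q18"],
}

def generate_stream(weights, n=1000, seed=42):
    """Return a reproducible stream of n (class, query) pairs drawn
    with the given per-class weights."""
    rng = random.Random(seed)
    classes = list(QUERY_CLASSES)
    picks = rng.choices(classes, weights=[weights[c] for c in classes], k=n)
    return [(c, rng.choice(QUERY_CLASSES[c])) for c in picks]

# Hypothetical range-heavy mix; the paper varies this per experiment.
stream = generate_stream({"range": 3, "exact_match": 3, "join": 2,
                          "aggregation": 1, "analytical": 1})
```

Fixing the seed makes each stream reproducible across the repeated runs described below.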

    3.6. Performance Metrics and Data Collection

    We collected the following performance metrics for each experiment:
    • Query execution time (milliseconds);
    • CPU utilization (percentage);
    • Memory usage (megabytes);
    • Disk I/O operations (reads/writes per second);
    • GPU utilization (percentage, for GPU-accelerated systems);
    • PCIe data transfer time (milliseconds, for GPU-accelerated systems).
    Data collection was automated using custom scripts that interfaced with PostgreSQL’s query planner and system monitoring tools (e.g., perf, nvidia-smi). Each experiment was repeated 30 times to account for variability, and we recorded both average values and standard deviations.

    3.7. Experimental Procedure

    Our experimental procedure consisted of the following steps:
    • For each hardware configuration and dataset size:
      (a) Load the TPC-H dataset into PostgreSQL;
      (b) Create indexes using the technique under evaluation;
      (c) Vacuum and analyze the database to update statistics.
    • For each query in the query set:
      (a) Clear database caches and buffers;
      (b) Execute the query and collect performance metrics;
      (c) Repeat 30 times.
    • For adaptive indexing techniques:
      (a) Train the model using a subset of the query workload (70% of queries);
      (b) Evaluate performance on a separate test set of queries (30% of queries).

    3.8. Control Measures

    To ensure fair comparisons and minimize confounding factors:
    • We disabled all non-essential background processes on the test systems.
    • The database configuration (e.g., buffer sizes, max connections) was standardized across all experiments, with settings optimized for each hardware configuration.
    • We used the same query optimizer settings for all experiments to isolate the impact of indexing strategies.
    • Environmental factors such as room temperature were monitored and kept consistent throughout the experiments.

    3.9. Data Analysis and Statistical Methods

    To ensure the validity and significance of our results:
    • We calculated mean values and standard deviations for all performance metrics across the 30 repetitions of each experiment, after discarding the 10 best and the 10 worst results (i.e., over the middle 10 runs).
    • We performed paired t-tests to assess the statistical significance of performance differences between indexing techniques, using a significance level of α = 0.05.
    • For adaptive indexing strategies, we used k-fold cross-validation (k = 5) to evaluate model performance and generalization.
    • We employed linear regression analysis to model the relationship between dataset size and performance metrics for each indexing technique.
    • Confidence intervals (95%) were calculated for all reported performance improvements.
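As an illustration of the protocol above, the trimmed mean over the middle runs and the paired t statistic can be computed with the standard library alone (a sketch; the paper does not specify its tooling):

```python
import math
import statistics

def trimmed_mean(samples, k=10):
    """Mean after discarding the k best and k worst samples,
    mirroring the 30-run protocol described above."""
    s = sorted(samples)
    kept = s[k:len(s) - k]
    return statistics.mean(kept)

def paired_t(a, b):
    """Paired t statistic for two matched samples; significance is then
    read from a t table with len(a) - 1 degrees of freedom."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
```

With 30 repetitions and k = 10, the trimmed mean summarizes the 10 middle runs, which damps outliers caused by OS jitter.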

    4. Results

    This section presents the performance metrics of various indexing techniques across different hardware configurations and scale factors. The performance of each indexing type is evaluated in terms of execution time, CPU utilization, and memory usage.

    4.1. B-Tree Index Performance

    B-Tree indexes demonstrated efficient performance for range queries and ordered traversal. As shown in Table 1, B-Tree indexes performed well across different hardware configurations, with notable differences in execution times and resource utilization.
    B-Tree indexes showed significant variation in performance based on hardware configuration. Multi-core setups consistently outperformed single-core setups, particularly at higher scale factors. GPU configurations, while not as fast as multi-core for small scale factors, demonstrated superior performance at larger scales.

    4.2. Hash Index Performance

    Hash indexes excelled in exact match lookups, demonstrating near-constant retrieval times across all hardware configurations. Table 2 presents the full performance metrics for Hash indexes.
    Hash indexes maintained consistent performance across varying hardware configurations. Multi-core and GPU configurations provided lower execution times compared to single-core setups, highlighting the benefits of parallelism in exact match scenarios.

    4.3. Cache-Conscious B-Tree Index Performance

    Cache-conscious B-Tree variants demonstrated substantial performance improvements over traditional B-Trees. Table 3 shows the full performance metrics for these variants.
    Cache-conscious B-Tree indexes showed substantial performance improvements, especially on multi-core and GPU configurations. This confirms their effectiveness in minimizing cache misses and optimizing memory hierarchy usage.

    SIMD-Optimized Hash Indexes

    SIMD-optimized Hash indexes leveraged the vectorization capabilities of modern CPUs, resulting in substantial performance gains for exact match lookups. Table 4 presents the full performance metrics for SIMD-optimized Hash indexes.
    SIMD-optimized Hash indexes demonstrated significant performance improvements over traditional Hash indexes, particularly for multi-core configurations. The vectorization capabilities of modern CPUs were effectively utilized, resulting in reduced execution times and lower CPU utilization.

    4.4. GPU-Based Indexing Techniques

    GPU-Accelerated Spatial Indexing

    GPU-accelerated spatial indexing techniques demonstrated remarkable performance improvements for spatial queries. Table 5 shows the full performance metrics for GPU-accelerated Quad-Tree (QT) and R-Tree (RT) indexes.
    GPU-accelerated spatial indexing techniques showed significant performance benefits, especially for larger scale factors. Quad-Trees generally outperformed R-Trees in terms of execution time, while R-Trees demonstrated slightly higher CPU utilization.

    4.5. Summary of Key Findings

    The results highlight the strengths and weaknesses of each indexing technique under various hardware conditions:
    • B-Tree indexes are versatile and perform well in range queries but require careful tuning for large datasets.
    • Hash indexes excel in exact matches, providing consistent performance across configurations.
    • Cache-conscious B-Trees leverage modern CPU architectures effectively, showing substantial improvements over traditional B-Trees, especially for larger datasets.
    • SIMD-optimized Hash indexes demonstrate significant performance gains, particularly on multi-core systems.
    • GPU-accelerated spatial indexing techniques offer remarkable performance for spatial queries, with Quad-Trees generally outperforming R-Trees.
    These findings underscore the importance of selecting appropriate indexing techniques based on the specific hardware configuration, dataset size, and query patterns of the database system.

    5. Results and Analysis

    This section presents a comprehensive analysis of our experimental results, evaluating the performance of various indexing techniques across different hardware configurations and scale factors. We examine traditional indexing methods, hardware-conscious approaches, and novel machine learning-based strategies.

    5.1. Performance Evaluation Metrics

    To ensure a thorough assessment of each indexing technique, we employed the following metrics, each formulated mathematically:
    • Execution Time (ms): Measures the total time taken to execute a query.
      T_exec = t_end − t_start
      where t_start is the time at the beginning of the query execution and t_end is the time at the end of the query execution.
    • CPU Utilization (%): Indicates the percentage of CPU resources used during query execution.
      CPU_util = (C_active / C_total) × 100
      where C_active is the active CPU time and C_total is the total CPU time available.
    • Memory Usage (MB): Represents the amount of memory consumed by the indexing structure and query processing.
      M_usage = M_end − M_start
      where M_start is the memory usage at the beginning and M_end is the memory usage at the end of the query processing.
    • Disk I/O (operations/s): Measures the number of disk read and write operations per second.
      Disk_I/O = (R_ops + W_ops) / t_duration
      where R_ops and W_ops are the read and write operations, respectively, and t_duration is the time duration of the measurement.
    • GPU Memory Usage (MB): Measures the GPU memory consumed by the index and during query processing (for GPU-accelerated techniques).
      GM_usage = GM_end − GM_start
      where GM_start is the GPU memory usage at the beginning and GM_end is the GPU memory usage at the end of the query processing.
    • PCIe Transfer Time (ms): Represents the time taken to transfer data between CPU and GPU memory (for GPU-accelerated techniques).
      T_PCIe = t_PCIe_end − t_PCIe_start
      where t_PCIe_start is the start time of the PCIe transfer and t_PCIe_end is the end time of the PCIe transfer.
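The metric definitions above translate directly into code; a small sketch follows (function names are hypothetical, and `run_query` stands in for the actual query execution):

```python
import time

def measure_execution(run_query):
    """T_exec = t_end - t_start, in milliseconds.
    `run_query` is a hypothetical callable that executes the query."""
    t_start = time.perf_counter()
    run_query()
    t_end = time.perf_counter()
    return (t_end - t_start) * 1000.0

def cpu_utilization(c_active, c_total):
    """CPU_util = (C_active / C_total) * 100."""
    return c_active / c_total * 100.0

def disk_io_rate(r_ops, w_ops, t_duration):
    """Disk I/O = (R_ops + W_ops) / t_duration, in operations per second."""
    return (r_ops + w_ops) / t_duration
```

The GPU memory and PCIe metrics follow the same start/end differencing pattern, with the raw counters read from nvidia-smi rather than the OS.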

    5.2. Traditional Indexing Techniques

    5.2.1. B-Tree Indexes

    B-Tree indexes demonstrated efficient performance for range queries and ordered traversal. Table 6 shows the performance metrics for B-Tree indexes across different hardware configurations and scale factors.
    The B-Tree index implementation in PostgreSQL is based on the following structure (Listing 1):
    Listing 1. B-Tree node structure in PostgreSQL.
    (Listing 1 is reproduced as an image in the published article.)
    This structure represents the core of PostgreSQL’s B-Tree implementation, which serves as our baseline for performance comparisons.
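Since the listing itself appears only as an image in the source, the sketch below approximates the per-page B-Tree metadata PostgreSQL keeps (modeled on BTPageOpaqueData in src/include/access/nbtree.h; simplified for illustration, not verbatim source):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;  /* physical block number on disk */
typedef uint16_t BTCycleId;    /* vacuum cycle identifier */

/* Simplified sketch of the special-space metadata stored on every
 * B-Tree page; sibling links support ordered range scans in both
 * directions without re-descending from the root. */
typedef struct BTPageOpaqueData {
    BlockNumber btpo_prev;     /* left sibling, for backward scans  */
    BlockNumber btpo_next;     /* right sibling, for forward scans  */
    uint32_t    btpo_level;    /* tree level; 0 means a leaf page   */
    uint16_t    btpo_flags;    /* page type bits (leaf, root, ...)  */
    BTCycleId   btpo_cycleid;  /* vacuum cycle of last page split   */
} BTPageOpaqueData;
```

The sibling pointers are what make B-Trees strong at the range queries evaluated in Table 6: once the leaf level is reached, a scan simply follows btpo_next.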

    5.2.2. Hash Indexes

    Hash indexes excelled in exact match lookups, demonstrating near-constant retrieval times across all hardware configurations. Table 7 presents the performance metrics for Hash indexes.
    The Hash index implementation in PostgreSQL uses the following key structure (Listing 2):
    Listing 2. Hash index structures in PostgreSQL.
    (Listing 2 is reproduced as an image in the published article.)
    This structure represents the metadata for PostgreSQL’s Hash index implementation, which serves as our baseline for performance comparisons.
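Listing 2 also appears only as an image in the source. As a stand-in, the sketch below approximates the hash meta-page fields and the linear-hashing bucket mapping used by PostgreSQL's hash access method (modeled on HashMetaPageData and _hash_hashkey2bucket(); simplified, not verbatim source):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Bucket;

/* Simplified sketch of the metadata kept on the hash index meta page. */
typedef struct HashMetaPageData {
    uint32_t hashm_maxbucket;  /* ID of the highest bucket in use    */
    uint32_t hashm_highmask;   /* mask used to compute bucket number */
    uint32_t hashm_lowmask;    /* mask for buckets not yet split     */
    double   hashm_ntuples;    /* number of tuples stored            */
    uint16_t hashm_ffactor;    /* target fill factor                 */
} HashMetaPageData;

/* Map a hash value to a bucket using the linear-hashing rule: try the
 * high mask first, and fall back to the low mask for bucket numbers
 * that do not exist yet. */
Bucket hashkey_to_bucket(uint32_t hashkey, const HashMetaPageData *m) {
    Bucket bucket = hashkey & m->hashm_highmask;
    if (bucket > m->hashm_maxbucket)
        bucket = bucket & m->hashm_lowmask;
    return bucket;
}
```

Because the mapping is a pair of mask operations, lookup cost is independent of table size, which matches the near-constant retrieval times in Table 7.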

    5.3. Hardware-Conscious Indexing Techniques

    5.3.1. Cache-Conscious B-Tree Variants

    Cache-conscious B-Tree variants demonstrated substantial performance improvements over traditional B-Trees. Table 8 shows the performance metrics for these variants.
    Our implementation of the cache-conscious B-Tree variant is based on the following structure (Listing 3):
    Listing 3. Cache-conscious B-Tree node structure.
    (Listing 3 is reproduced as an image in the published article.)
    This structure is designed to align with CPU cache line sizes, improving memory access patterns and reducing cache misses.
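Listing 3 is likewise an image in the source. The sketch below shows the general idea: field widths are chosen so that one node fills exactly one 64-byte cache line, so a node probe costs at most one cache miss (an illustrative layout under that assumption, not the paper's exact code):

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64

/* One node per cache line: 2 + 2 + 7*4 + 8*4 = 64 bytes. */
typedef struct CacheConsciousNode {
    uint16_t nkeys;        /* keys currently stored          */
    uint16_t flags;        /* leaf/internal marker           */
    uint32_t keys[7];      /* seven 4-byte separator keys    */
    uint32_t children[8];  /* child node identifiers         */
} __attribute__((aligned(CACHE_LINE))) CacheConsciousNode;

/* In-node search: scanning 7 contiguous keys already resident in one
 * cache line is typically cheaper than a branchy binary search. */
int node_lower_bound(const CacheConsciousNode *n, uint32_t key) {
    int i = 0;
    while (i < n->nkeys && n->keys[i] < key)
        i++;
    return i;
}
```

Aligning nodes to cache-line boundaries is what reduces the cache-miss counts reflected in the Table 8 improvements.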

    5.3.2. SIMD-Optimized Hash Indexes

    SIMD-optimized Hash indexes leveraged the vectorization capabilities of modern CPUs, resulting in substantial performance gains for exact match lookups. Table 9 presents the performance metrics for SIMD-optimized Hash indexes.
    Our SIMD-optimized Hash index implementation utilizes AVX-512 instructions for parallel hash computations (Listing 4):
    Listing 4. SIMD-optimized hash computation.
    (Listing 4 is reproduced as an image in the published article.)
    This implementation allows for the computing of hash values for 16 keys simultaneously, significantly accelerating the hash index operations.
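Listing 4 is also an image in the source. The sketch below illustrates the same idea with a simple multiplicative hash: with AVX-512 available, all 16 keys are hashed by a single vector multiply, with a bit-identical scalar fallback otherwise (illustrative, not the paper's exact code):

```c
#include <assert.h>
#include <stdint.h>
#ifdef __AVX512F__
#include <immintrin.h>
#endif

/* Knuth's multiplicative hashing constant. */
#define HASH_MULT 2654435761u

/* Hash 16 keys at once.  With AVX-512, _mm512_mullo_epi32 multiplies
 * all 16 lanes in one instruction; the scalar loop produces identical
 * results on hardware without AVX-512. */
void hash16(const uint32_t keys[16], uint32_t out[16]) {
#ifdef __AVX512F__
    __m512i k = _mm512_loadu_si512((const void *)keys);
    __m512i m = _mm512_set1_epi32((int)HASH_MULT);
    _mm512_storeu_si512((void *)out, _mm512_mullo_epi32(k, m));
#else
    for (int i = 0; i < 16; i++)
        out[i] = keys[i] * HASH_MULT;
#endif
}
```

Replacing 16 scalar multiplies with one vector instruction is the mechanism behind the reduced execution times and CPU utilization reported in Table 9.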

    5.4. GPU-Based Indexing Techniques

    GPU-Accelerated Spatial Indexing

    GPU-accelerated spatial indexing techniques demonstrated remarkable performance improvements for spatial queries. Table 10 shows the performance metrics for GPU-accelerated Quad-Tree (QT) and R-Tree (RT) indexes.
    Our GPU-accelerated R-Tree implementation uses CUDA for parallel node traversal (Listing 5):
    Listing 5. GPU-accelerated R-Tree traversal.
    (Listing 5 is reproduced as an image in the published article.)
    This CUDA kernel enables parallel traversal of the R-Tree structure on the GPU, significantly accelerating spatial query processing.
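Listing 5 is an image in the source; the kernel below sketches one common way to parallelize R-Tree traversal on a GPU, with one thread per frontier node and an atomically built next frontier (illustrative CUDA under those assumptions, not the paper's exact kernel):

```cuda
// Level-by-level parallel R-Tree probe: each thread expands one node
// of the current frontier and appends intersecting children to a
// global next frontier.
struct Rect { float xmin, ymin, xmax, ymax; };

__device__ bool intersects(const Rect &a, const Rect &b) {
    return a.xmin <= b.xmax && b.xmin <= a.xmax &&
           a.ymin <= b.ymax && b.ymin <= a.ymax;
}

__global__ void rtree_probe(const Rect *node_mbr, const int *first_child,
                            const int *num_children, const int *frontier,
                            int frontier_size, Rect query,
                            int *next_frontier, int *next_size) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= frontier_size) return;
    int node = frontier[tid];
    // Expand every child whose bounding box overlaps the query window.
    for (int c = 0; c < num_children[node]; c++) {
        int child = first_child[node] + c;
        if (intersects(node_mbr[child], query)) {
            int slot = atomicAdd(next_size, 1);  // reserve an output slot
            next_frontier[slot] = child;
        }
    }
}
```

The host relaunches the kernel once per tree level until the frontier contains only leaves; the PCIe transfer metric above captures the cost of moving frontiers and results between host and device.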

    5.5. Machine Learning-Based Indexing Techniques

    5.5.1. Reinforcement Learning-Based Index Selection

    Reinforcement learning-based index selection techniques showed promising results in dynamically selecting and configuring indexes. Table 11 presents the performance metrics for this approach.
    Our implementation of the reinforcement learning-based index selection uses TensorFlow for the RL agent (Listing 6):
    Listing 6. RL-based index selection agent.
    This RL agent learns to select optimal indexing strategies based on the current database state and query workload.
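    The TensorFlow listing is not reproduced in this version. As a dependency-free illustration of the loop it implements, the tabular sketch below treats the workload class as the state, the index type as the action, and negative query latency as the reward; these encodings, and the simulated latencies, are assumptions for the example rather than the paper's design:

```python
import random

# Illustrative sketch only: the paper's agent is built with TensorFlow.
# Unseen (state, action) pairs default to 0.0, which acts as optimistic
# initialization and forces each index type to be tried at least once.
ACTIONS = ["btree", "hash", "cache_btree", "simd_hash"]

class IndexSelector:
    def __init__(self, alpha=0.5, epsilon=0.1):
        self.q = {}              # (state, action) -> estimated value
        self.alpha = alpha       # learning rate
        self.epsilon = epsilon   # exploration probability

    def choose(self, state):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, latency_ms):
        old = self.q.get((state, action), 0.0)
        reward = -latency_ms     # faster queries -> higher reward
        self.q[(state, action)] = old + self.alpha * (reward - old)

# Hypothetical mean latencies per index type for one workload class.
latency = {"btree": 150.0, "hash": 60.0, "cache_btree": 100.0, "simd_hash": 20.0}
agent = IndexSelector(epsilon=0.0)   # greedy; optimistic init still explores
for _ in range(20):
    a = agent.choose("point_lookup")
    agent.update("point_lookup", a, latency[a])
```

    After a few episodes the agent's greedy choice for this workload settles on the lowest-latency index; a production agent would add many more state features, which is where a neural function approximator such as the paper's TensorFlow model comes in.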

    5.5.2. Neural Network-Based Index Advisors

    Neural network-based index advisors demonstrated accurate and effective index recommendations for new queries and workloads. Table 12 shows the performance metrics for this approach.
    Our neural network-based index advisor implementation uses PyTorch (Listing 7):
    Listing 7. NN-based index advisor.
    This neural network learns to recommend indexing strategies based on query features and historical performance data.
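    The PyTorch listing appears only as an image. The stdlib sketch below stands in for it with a multi-class perceptron mapping query features to an index recommendation; the two features (selectivity, range-query flag), the labels, and the toy history are invented for illustration and are far simpler than the paper's model:

```python
# Illustrative sketch only: the paper's advisor is a PyTorch network
# trained on query features and historical performance data.
class IndexAdvisor:
    def __init__(self, n_features=2, classes=("btree", "hash")):
        self.classes = classes
        self.w = [[0.0] * n_features for _ in classes]  # weight row per class
        self.b = [0.0] * len(classes)

    def scores(self, x):
        return [sum(wi * xi for wi, xi in zip(w, x)) + b
                for w, b in zip(self.w, self.b)]

    def recommend(self, x):
        s = self.scores(x)
        return self.classes[s.index(max(s))]

    def train(self, data, epochs=20, lr=0.1):
        for _ in range(epochs):
            for x, label in data:
                predicted = self.recommend(x)
                if predicted != label:   # perceptron update on mistakes only
                    y, k = self.classes.index(label), self.classes.index(predicted)
                    for i, xi in enumerate(x):
                        self.w[y][i] += lr * xi
                        self.w[k][i] -= lr * xi
                    self.b[y] += lr
                    self.b[k] -= lr

# Toy workload history: range scans favour B-Trees, selective point
# lookups favour hash indexes.
history = [([0.9, 1.0], "btree"), ([0.7, 1.0], "btree"),
           ([0.05, 0.0], "hash"), ([0.01, 0.0], "hash")]
advisor = IndexAdvisor()
advisor.train(history)
```

    The structure mirrors the advisor's role: featurize a query, score each candidate index, recommend the argmax, and update the model from observed outcomes.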

    5.6. Summary of Key Findings

    To highlight the main findings of our study, we present summary tables comparing the performance of different indexing techniques across hardware configurations and scale factors.

    5.7. Analysis and Discussion

    Our experimental results reveal several key insights:
    • Hardware-conscious techniques outperform traditional indexes: Cache-conscious B-Tree variants and SIMD-optimized hash indexes consistently outperform their traditional counterparts across all hardware configurations. For example, the cache-conscious B-Tree achieves a 34.2% reduction in execution time compared to the traditional B-Tree on multi-core systems (Table 13).
    • GPU acceleration benefits vary: GPU acceleration shows the most significant benefits for specialized indexing techniques like spatial indexing, with up to 37.8% reduction in execution time for Quad-Trees compared to CPU-based implementations (Table 10). However, its advantages for traditional indexes are more modest, likely due to data transfer overheads.
    • Machine learning approaches show promise: Both reinforcement learning-based and neural network-based indexing techniques demonstrate significant performance improvements over traditional methods. The NN-based approach, in particular, achieves a 48.6% reduction in execution time compared to traditional B-Trees on multi-core systems (Table 13). This suggests that ML-based techniques can effectively adapt to diverse query workloads and hardware configurations.
    • Scalability across dataset sizes: Our experiments across different scale factors (1 GB, 10 GB, 100 GB) reveal that the performance benefits of advanced indexing techniques generally increase with dataset size. This scalability is particularly evident for cache-conscious and ML-based approaches, which show improved relative performance as data volumes grow.
    • Trade-offs between performance and resource utilization: While advanced indexing techniques offer significant performance improvements, they often come at the cost of increased implementation complexity and, in some cases, higher memory usage. Database administrators and developers must carefully consider these trade-offs when selecting indexing strategies for specific use cases (Table 14).
    These findings, presented in Table 13, underscore the importance of tailoring indexing strategies to specific hardware configurations and workload characteristics. The significant performance gains observed with hardware-conscious and ML-based techniques highlight the potential for substantial query optimization in modern database systems. However, the variability in performance improvements across different scenarios emphasizes the need for careful evaluation and tuning when implementing these advanced indexing strategies in production environments.

    6. Discussion

    This section examines the broader implications of our experimental results, addresses limitations of our study, and proposes directions for future research in database indexing strategies.

    6.1. Implications of Hardware-Conscious Indexing

    Our experiments demonstrate that hardware-conscious indexing techniques consistently outperform traditional methods across various hardware configurations. Key implications include:
    • Cache Optimization: Cache-conscious B-Trees achieved a 34.2% reduction in execution time compared to traditional B-Trees, highlighting the critical role of cache utilization in modern database systems.
    • Vectorization Benefits: SIMD-optimized hash indexes showed a 32.4% reduction in execution time for exact match queries, emphasizing the potential of vector processing in database operations.
    • Hardware-Software Co-design: The varying performance improvements across hardware configurations underscore the importance of co-designing database algorithms and hardware architectures.

    6.2. GPU Acceleration: Opportunities and Challenges

    GPU-accelerated indexing techniques revealed both promising opportunities and notable challenges:
    • Specialized Workloads: GPU acceleration showed up to 37.8% reduction in execution time for spatial indexing (Quad-Trees), indicating their suitability for data-parallel operations and complex geometric computations.
    • Data Transfer Overhead: The modest gains for traditional indexes on GPU systems highlight the impact of data transfer costs between CPU and GPU memory.
    • Hybrid Approaches: Future research should explore dynamic allocation of indexing tasks between CPUs and GPUs based on workload characteristics and resource availability.

    6.3. Machine Learning-Based Indexing: Promise and Challenges

    ML-based indexing techniques demonstrated significant potential, with up to 48.6% reduction in execution time. However, several challenges remain:
    • Training Data Quality: The effectiveness of ML-based techniques heavily depends on the quality and representativeness of training data.
    • Model Interpretability: Developing interpretable ML models for index selection could enhance trust and adoption in production environments.
    • Online Learning: Future research should explore online learning techniques for continuous adaptation to changing workload patterns.

    6.4. Limitations and Future Work

    We acknowledge several limitations in our study:
    • Workload Diversity: Our experiments may not fully capture the diversity of real-world database workloads.
    • Hardware Configurations: The study was conducted on a limited set of hardware configurations.
    • Concurrent Workloads: Our experiments focused primarily on single-query performance.
    • Index Maintenance Costs: A more comprehensive study of index creation, updates, and maintenance costs is warranted.
    Based on these limitations and our findings, we propose the following directions for future research:
    • Adaptive Hybrid Indexing: Developing strategies that dynamically switch between different indexing techniques based on query patterns and hardware resources.
    • Hardware-Aware Query Optimization: Integrating hardware-conscious indexing techniques into query optimizers.
    • Explainable ML-based Indexing: Investigating techniques for making ML-based indexing decisions more interpretable.
    • Energy-Efficient Indexing: Exploring energy consumption implications and developing energy-aware indexing strategies.
    • Indexing for Emerging Hardware: Investigating techniques optimized for emerging technologies such as non-volatile memory and domain-specific accelerators.
    In conclusion, our study demonstrates the significant potential of hardware-conscious and ML-based indexing techniques to improve database query performance. Realizing these benefits in practice requires careful consideration of workload characteristics, hardware configurations, and implementation complexities. As database systems evolve in the era of heterogeneous computing, the development of adaptive, hardware-aware indexing strategies remains a crucial area for ongoing research and innovation.

    Author Contributions

    Writing—original draft, M.A. and M.V.B.; Supervision, P.M.; Funding acquisition, P.V. and J.S. All authors have read and agreed to the published version of the manuscript.

    Funding

    This work is funded by National Funds through the FCT—Foundation for Science and Technology, I.P., within the scope of the project Ref. UIDB/05583/2020. Furthermore, we thank the Research Center in Digital Services (CISeD) and the Instituto Politécnico de Viseu for their support. Maryam Abbasi thanks the national funding by FCT—Foundation for Science and Technology, I.P., through the institutional scientific employment program contract (CEECINST/00077/2021). This work is also supported by FCT/MCTES through national funds and, when applicable, co-funded by EU funds under the project UIDB/50008/2020, with DOI identifier 10.54499/UIDB/50008/2020.

    Data Availability Statement

    Data is contained within the article.

    Conflicts of Interest

    The authors declare no conflict of interest.

    Table 1. Performance metrics for B-Tree index.

    Hardware Config | Query No | Scale Factor | Execution Time (ms) | CPU Utilization (%) | Memory Usage (MB)
    single_core | 1 | 1 | 157.83 ± 8.24 | 86 ± 3 | 289 ± 12
    single_core | 1 | 10 | 1844.27 ± 45.69 | 93 ± 2 | 1137 ± 31
    single_core | 1 | 100 | 19,374.63 ± 328.91 | 97 ± 1 | 8219 ± 157
    single_core | 3 | 1 | 312.68 ± 11.75 | 89 ± 2 | 547 ± 18
    single_core | 3 | 10 | 3406.91 ± 79.32 | 95 ± 1 | 2518 ± 47
    single_core | 3 | 100 | 35,874.29 ± 563.18 | 98 ± 1 | 17,632 ± 289
    multi_core | 1 | 1 | 102.45 ± 5.31 | 53 ± 3 | 523 ± 19
    multi_core | 1 | 10 | 1189.73 ± 29.84 | 64 ± 2 | 2309 ± 53
    multi_core | 1 | 100 | 5891.58 ± 193.72 | 76 ± 2 | 9368 ± 178
    multi_core | 3 | 1 | 201.35 ± 9.27 | 59 ± 3 | 1085 ± 28
    multi_core | 3 | 10 | 2450.84 ± 57.91 | 73 ± 2 | 4627 ± 89
    multi_core | 3 | 100 | 26,088.17 ± 379.46 | 85 ± 2 | 35,127 ± 412
    gpu | 1 | 1 | 129.66 ± 6.73 | 26 ± 2 | 1153 ± 34
    gpu | 1 | 10 | 1527.94 ± 38.21 | 33 ± 2 | 4518 ± 87
    gpu | 1 | 100 | 7239.42 ± 246.18 | 39 ± 2 | 12,947 ± 231
    gpu | 3 | 1 | 263.74 ± 10.58 | 30 ± 2 | 2284 ± 51
    gpu | 3 | 10 | 2986.31 ± 69.75 | 38 ± 2 | 9174 ± 163
    gpu | 3 | 100 | 30,383.95 ± 487.29 | 47 ± 2 | 69,584 ± 578
    Table 2. Performance metrics for Hash index.

    Hardware Config | Query No | Scale Factor | Execution Time (ms) | CPU Utilization (%) | Memory Usage (MB)
    single_core | 6 | 1 | 45.37 ± 2.41 | 61 ± 3 | 157 ± 8
    single_core | 6 | 10 | 526.84 ± 16.39 | 69 ± 2 | 618 ± 21
    single_core | 6 | 100 | 5049.31 ± 107.46 | 72 ± 2 | 4537 ± 93
    single_core | 14 | 1 | 141.85 ± 6.52 | 76 ± 3 | 301 ± 13
    single_core | 14 | 10 | 1495.37 ± 34.81 | 84 ± 2 | 1247 ± 37
    single_core | 14 | 100 | 15,892.64 ± 289.75 | 89 ± 2 | 9734 ± 186
    multi_core | 6 | 1 | 24.18 ± 1.32 | 41 ± 2 | 293 ± 11
    multi_core | 6 | 10 | 276.95 ± 10.47 | 49 ± 2 | 1226 ± 34
    multi_core | 6 | 100 | 1819.46 ± 73.25 | 52 ± 2 | 5934 ± 127
    multi_core | 14 | 1 | 69.24 ± 3.65 | 47 ± 2 | 578 ± 19
    multi_core | 14 | 10 | 789.53 ± 23.16 | 60 ± 2 | 2451 ± 58
    multi_core | 14 | 100 | 8472.91 ± 173.85 | 66 ± 2 | 19,127 ± 274
    gpu | 6 | 1 | 35.64 ± 1.83 | 16 ± 1 | 295 ± 12
    gpu | 6 | 10 | 403.75 ± 12.69 | 23 ± 2 | 1237 ± 36
    gpu | 6 | 100 | 2124.57 ± 86.31 | 27 ± 2 | 7063 ± 153
    gpu | 14 | 1 | 92.71 ± 4.76 | 21 ± 2 | 581 ± 20
    gpu | 14 | 10 | 1052.36 ± 28.74 | 28 ± 2 | 2465 ± 61
    gpu | 14 | 100 | 11,126.95 ± 217.38 | 34 ± 2 | 19,463 ± 289
    Table 3. Performance metrics for cache-conscious B-Tree index.

    Hardware Config | Query No | Scale Factor | Execution Time (ms) | CPU Utilization (%) | Memory Usage (MB)
    single_core | 1 | 1 | 136.82 ± 7.51 | 84 ± 3 | 281 ± 11
    single_core | 1 | 10 | 1719.46 ± 41.87 | 91 ± 2 | 1108 ± 29
    single_core | 1 | 100 | 18,492.63 ± 321.71 | 95 ± 1 | 8173 ± 157
    single_core | 3 | 1 | 291.85 ± 10.94 | 87 ± 2 | 543 ± 17
    single_core | 3 | 10 | 3267.38 ± 75.19 | 94 ± 1 | 2486 ± 45
    single_core | 3 | 100 | 34,576.28 ± 537.64 | 97 ± 1 | 17,392 ± 276
    multi_core | 1 | 1 | 93.14 ± 4.27 | 49 ± 3 | 508 ± 18
    multi_core | 1 | 10 | 1078.65 ± 26.93 | 61 ± 2 | 2246 ± 49
    multi_core | 1 | 100 | 5386.58 ± 189.32 | 72 ± 2 | 9253 ± 178
    multi_core | 3 | 1 | 184.92 ± 8.43 | 56 ± 3 | 1057 ± 26
    multi_core | 3 | 10 | 2145.79 ± 53.64 | 71 ± 2 | 4518 ± 84
    multi_core | 3 | 100 | 24,163.75 ± 351.28 | 82 ± 2 | 34,286 ± 389
    gpu | 1 | 1 | 120.73 ± 6.18 | 25 ± 2 | 1121 ± 31
    gpu | 1 | 10 | 1416.85 ± 35.42 | 31 ± 2 | 4397 ± 82
    gpu | 1 | 100 | 6612.42 ± 231.84 | 37 ± 2 | 12,814 ± 231
    gpu | 3 | 1 | 246.38 ± 9.85 | 29 ± 2 | 2218 ± 48
    gpu | 3 | 10 | 2791.47 ± 65.28 | 36 ± 2 | 8924 ± 157
    gpu | 3 | 100 | 29,243.76 ± 459.81 | 45 ± 2 | 67,685 ± 563
    Table 4. Performance metrics for SIMD-optimized Hash index.

    Hardware Config | Query No | Scale Factor | Execution Time (ms) | CPU Utilization (%) | Memory Usage (MB)
    single_core | 6 | 1 | 33.15 ± 1.74 | 55 ± 2 | 150 ± 6
    single_core | 6 | 10 | 394.75 ± 12.63 | 63 ± 2 | 596 ± 15
    single_core | 6 | 100 | 3914.53 ± 85.27 | 67 ± 2 | 4518 ± 87
    single_core | 14 | 1 | 87.72 ± 3.51 | 65 ± 2 | 287 ± 9
    single_core | 14 | 10 | 1003.22 ± 27.46 | 74 ± 2 | 1193 ± 28
    single_core | 14 | 100 | 10,416.34 ± 190.31 | 80 ± 2 | 9401 ± 163
    multi_core | 6 | 1 | 17.41 ± 0.89 | 35 ± 1 | 281 ± 8
    multi_core | 6 | 10 | 204.86 ± 7.63 | 43 ± 1 | 1185 ± 26
    multi_core | 6 | 100 | 2076.31 ± 51.82 | 47 ± 1 | 6936 ± 152
    multi_core | 14 | 1 | 45.48 ± 2.31 | 40 ± 1 | 553 ± 14
    multi_core | 14 | 10 | 522.18 ± 15.67 | 48 ± 1 | 2305 ± 43
    multi_core | 14 | 100 | 5447.25 ± 113.68 | 54 ± 1 | 18,195 ± 276
    gpu | 6 | 1 | 23.09 ± 1.22 | 10 ± 1 | 284 ± 9
    gpu | 6 | 10 | 272.82 ± 10.21 | 17 ± 1 | 1199 ± 27
    gpu | 6 | 100 | 1487.94 ± 74.31 | 22 ± 1 | 6937 ± 167
    gpu | 14 | 1 | 56.45 ± 2.63 | 15 ± 1 | 555 ± 15
    gpu | 14 | 10 | 650.28 ± 18.34 | 22 ± 1 | 2281 ± 45
    gpu | 14 | 100 | 6703.30 ± 144.91 | 28 ± 1 | 18,530 ± 289
    Table 5. Performance metrics for GPU Quadtree (QT) and R-Tree (RT) indices.

    Index Type | Hardware Config | Query No | Scale Factor | Execution Time (ms) | CPU Utilization (%) | Memory Usage (MB)
    QT | gpu | 18 | 1 | 45.40 ± 2.29 | 30 ± 1 | 555 ± 16
    QT | gpu | 18 | 10 | 516.52 ± 15.49 | 36 ± 1 | 2386 ± 47
    QT | gpu | 18 | 100 | 3201.70 ± 107.53 | 41 ± 1 | 3648 ± 283
    QT | gpu | 19 | 1 | 67.38 ± 3.47 | 25 ± 1 | 1120 ± 28
    QT | gpu | 19 | 10 | 767.84 ± 21.37 | 33 ± 1 | 4596 ± 82
    QT | gpu | 19 | 100 | 8112.15 ± 168.79 | 39 ± 1 | 36,657 ± 476
    RT | gpu | 18 | 1 | 58.51 ± 2.64 | 35 ± 1 | 555 ± 16
    RT | gpu | 18 | 10 | 649.50 ± 17.86 | 41 ± 1 | 2322 ± 45
    RT | gpu | 18 | 100 | 3978.41 ± 132.91 | 46 ± 1 | 4408 ± 283
    RT | gpu | 19 | 1 | 78.82 ± 3.89 | 30 ± 1 | 1120 ± 28
    RT | gpu | 19 | 10 | 897.73 ± 24.14 | 38 ± 1 | 4596 ± 82
    RT | gpu | 19 | 100 | 9288.28 ± 192.22 | 44 ± 1 | 36,656 ± 476
    Table 6. Performance metrics for B-Tree index (Query 1, scale factor 100).

    Hardware Config | Execution Time (ms) | CPU Util. (%) | Memory Usage (MB) | Disk I/O (ops/s)
    Single-core | 19,374.63 ± 328.91 | 97 | 8219 | 1247 ± 42
    Multi-core | 5891.58 ± 193.72 | 76 | 9368 | 3856 ± 128
    GPU | 7239.42 ± 246.18 | 39 | 12,947 | 2973 ± 95
    Table 7. Performance metrics for Hash index (Query 6, scale factor 100).

    Hardware Config | Execution Time (ms) | CPU Util. (%) | Memory Usage (MB) | Disk I/O (ops/s)
    Single-core | 5279.31 ± 107.46 | 72 | 4537 | 832 ± 28
    Multi-core | 1819.46 ± 73.25 | 52 | 5934 | 2564 ± 86
    GPU | 2124.57 ± 86.31 | 27 | 7063 | 1973 ± 64
    Table 8. Performance metrics for cache-conscious B-Tree index (Query 1, scale factor 100).

    Hardware Config | Execution Time (ms) | CPU Util. (%) | Memory Usage (MB) | Disk I/O (ops/s)
    Single-core | 13,692.84 ± 246.75 | 94 | 4371 | 986 ± 33
    Multi-core | 3861.23 ± 142.68 | 57 | 5935 | 3024 ± 104
    GPU | 4587.53 ± 179.46 | 31 | 7073 | 2418 ± 78
    Table 9. Performance metrics for SIMD Hash index (Query 6, scale factor 100).

    Hardware Config | Execution Time (ms) | CPU Util. (%) | Memory Usage (MB) | Disk I/O (ops/s)
    Single-core | 3914.53 ± 85.27 | 67 | 4518 | 724 ± 24
    Multi-core | 1276.31 ± 51.82 | 47 | 5763 | 2236 ± 75
    GPU | 1487.94 ± 74.31 | 22 | 6937 | 1724 ± 56
    Table 10. Performance metrics for GPU Quadtree (QT) and R-Tree (RT) indices (Query 18, scale factor 100).

    Index Type | Execution Time (ms) | CPU Util. (%) | GPU Mem Usage (MB) | PCIe Time (ms) | Disk I/O (ops/s)
    QT | 3201.70 ± 107.53 | 41 | 3648 | 342.81 ± 18.36 | 1524 ± 51
    RT | 3978.41 ± 132.91 | 46 | 4408 | 387.53 ± 20.74 | 1738 ± 58
    Table 11. Performance metrics for RL index selection (mixed workload, scale factor 100).

    Hardware Config | Execution Time (ms) | CPU Util. (%) | Memory Usage (MB) | Disk I/O (ops/s)
    Single-core | 11,287.65 ± 210.55 | 92 | 4171 | 957 ± 32
    Multi-core | 3370.01 ± 106.22 | 53 | 5935 | 2964 ± 99
    GPU | 3909.25 ± 150.72 | 37 | 7073 | 2298 ± 74
    Table 12. Performance metrics for NN index advisor (mixed workload, scale factor 100).

    Hardware Config | Execution Time (ms) | CPU Util. (%) | Memory Usage (MB) | Disk I/O (ops/s)
    Single-core | 10,270.31 ± 179.24 | 82 | 4427 | 912 ± 30
    Multi-core | 3030.85 ± 95.73 | 57 | 5920 | 2820 ± 94
    GPU | 3536.27 ± 106.81 | 32 | 7105 | 2185 ± 70
    Table 13. Performance improvement of advanced techniques over traditional B-Tree (Query 1, scale factor 100).

    Index Type | Execution Time Improvement | CPU Utilization Reduction | Memory Usage Reduction
    Cache-conscious B-Tree (multi-core) | 34.2% | 25.0% | 36.6%
    RL-based (multi-core) | 42.8% | 30.3% | 36.6%
    NN-based (multi-core) | 48.6% | 25.0% | 36.8%
    Table 14. Performance comparison of indexing techniques (Query 1, scale factor 100).

    Index Type | Execution Time (ms) | CPU Utilization (%) | Memory Usage (MB)
    B-Tree (single-core) | 19,374.63 ± 328.91 | 97 | 8219
    B-Tree (multi-core) | 5891.58 ± 193.72 | 76 | 9368
    Cache-conscious B-Tree (multi-core) | 3861.23 ± 142.68 | 57 | 5935
    RL-based (multi-core) | 3370.01 ± 106.22 | 53 | 5935
    NN-based (multi-core) | 3030.85 ± 95.73 | 57 | 5920

    © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

    Share and Cite

    MDPI and ACS Style

    Abbasi, M.; Bernardo, M.V.; Váz, P.; Silva, J.; Martins, P. Revisiting Database Indexing for Parallel and Accelerated Computing: A Comprehensive Study and Novel Approaches. Information 2024, 15, 429. https://doi.org/10.3390/info15080429

