Topic: Parallel Computing and Optimization Techniques

This cluster of papers focuses on parallel computing, performance optimization, and various aspects of multicore and heterogeneous computing. It covers topics such as GPU computing, memory systems, benchmarking, power management, simulation platforms, and high-performance computing.
Latest Publications
FFTArray: A Python library for the implementation of discretized multi-dimensional Fourier transforms

article

SEM-OS: Memory Operating System for Conversational LLMs

article

L15 Visibilite Totale - Complete Index Law for SEM-OS Memory Operating System

article

The Memory Processing Unit: A Generalized Interface for End-to-End In-Memory Execution

article

Enhancing Grid Resilience for Giga-Watt Scale Data Centers Using High Voltage Circuit Breaker Operated Braking Resistors

article

TEMP: A Memory Efficient Physical-Aware Tensor Partition-Mapping Framework on Wafer-Scale Chips

article

AQ-256 DETERMINISTIC SETTLEMENT INFRASTRUCTURE (DSI) // TRANSITION OVERLAY

article

RECORD 04: OPERATIONAL BASELINE // AQ-256 SUBSTRATE

article

A methodology for accurate benchmarking of neural network accelerators using a high-level synthesis-based hardware generator

article

Multi-GPU acceleration of PALABOS fluid solver using C++ standard parallelism

article

Highly Cited Publications from the Past 5 Years
Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct

book-chapter, OpenAlex 12930, FWCI 837.4225

UCSF ChimeraX: Tools for structure building and analysis

article, OpenAlex 3592, FWCI 793.1238

Algorithms+Data Structures = Programs

book-chapter, OpenAlex 960, FWCI 10.7435

IEEE Transactions on Parallel and Distributed Systems

paratext, OpenAlex 581, FWCI 0

PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

article, OpenAlex 500, FWCI 250.5783

QLoRA: Efficient Finetuning of Quantized LLMs

preprint, OpenAlex 484, FWCI 0

TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings

article, OpenAlex 396, FWCI 181.9127

NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads

article, OpenAlex 337, FWCI 113.5169

PREDICTIVE PERFORMANCE AND SCALABILITY MODELING OF A LARGE-SCALE APPLICATION

article, OpenAlex 312, FWCI 0

IEEE Transactions on Dependable and Secure Computing

paratext, OpenAlex 309, FWCI 0