computer-architecture

DEV Community

nvidia-peermem "Invalid argument" on Ubuntu — Fix GPUDirect RDMA with DMA-BUF
TL;DR: If modprobe nvidia-peermem fails with Invalid argument (-EINVAL) on a system using the inbox Ubuntu InfiniBand stack (rdma-core), the module is not broken and you do not need it. nvidia-peermem requires an API that only exists in MLNX_OFED. On Hopper/Blackwell GPUs with the NVIDIA open driver, use DMA-BUF ins…

computer-architecture, engineering
Hacker News
Hey There Buddo
3d ago

I bought a cute little 4-bit CPU kit from AliExpress called the TD4. It has 2 registers, some LEDs, and 16 bytes of program ROM. Quite limited but still very cool and teaches a lot of principles of computer architecture. The documentation, schematics, and pictures for this CPU are here: https://github.com/wuxx/TD4-4BIT-CPU. It’s a little sparse though. I can imagine a student getting overwhelmed. …

computer-architecture, computer-science
Hacker News

Occupancy Math on the AMD MI355X (CDNA4): A From-First-Principles Guide Ask a GPU kernel engineer how their kernel is doing and occupancy comes up within a sentence or two. It’s the number everyone quotes and the dial everyone reaches for — and, in my experience, the metric people understand least. Most treat it as an opaque percentage the profiler hands back. It isn’t. Occupancy is fully derivab…

computer-architecture, engineering
SIGARCH
Shanqing Lin·...·Babak Falsafi
13d ago

The Return of Rigorous Full-System Timing Simulation Accurate timing simulation remains one of the most important tools in computer architecture, but modern systems have made cycle-level simulation increasingly impractical. Today’s platforms combine many-core CPUs, deep memory hierarchies, accelerators, complex I/O, and large software stacks, making detailed simulation extremely slow—often requir…

computer-architecture, computer-science
DEV Community

If you've ever deployed memory-bound workloads on AWS Graviton, you know that CPU compute speed is only part of the story. Another factor in real-world performance is how efficiently your code accesses the memory subsystem, specifically the cache hierarchy, interconnects, and physical DRAM. In this article, I will walk through how to use the Arm System Characterization Tool (ASCT) to analyze the …

computer-architecture, computer-science
DEV Community

Originally published on Alpinum Consulting. The growth of open processor architectures has significantly increased the adoption of RISC‑V across embedded systems, AI accelerators, and high-performance computing platforms. This flexibility allows engineering teams to design processors with highly customised instruction sets and microarchitectures. However, this flexibility also increases verificati…

computer-architecture, engineering
UC Davis Computer Architecture
Hacker News

This is the fifth installment of the 80386 series. The FPGA CPU is now far enough along to run real software, and this post is about how it works. z386 is a 386-class CPU built around the original Intel microcode, in the same spirit as z8086. The core is not an instruction-by-instruction emulator in RTL. The goal is to recreate enough of the original machine that the recovered 386 control ROM can…

computer-architecture, computer-science
DEV Community

External GPU (eGPU) + NVIDIA Drivers on Linux: Solving the Display Manager Initialization Problem
TL;DR: If your NVIDIA eGPU works in recovery mode but gives a black screen on normal boot, you're missing one critical Xorg option: AllowExternalGpus. This guide shows how to fix it properly on any X11-based Linux distribution.
Introduction
Installing NVIDIA drivers on a Linux system with an externa…

computer-architecture, engineering
Semiconductor Engineering

A new technical paper, “Emulation-based System-on-Chip Security Verification: Challenges and Opportunities,” was published by researchers at the University of Florida.
Abstract
“Increasing system-on-chip (SoC) heterogeneity, deep hardware/software integration, and the proliferation of third-party intellectual property (IP) have brought security validation to the forefront of semiconductor design. Whi…

computer-architecture, electrical-engineering, engineering
Hacker News
Jason Robert Carey Patterson; Last Updated Mar
4/12/2026

WARNING: This article is meant to be informal and fun! Okay, so you're a CS graduate and you did a hardware course as part of your degree, but perhaps that was a few years ago now, and you haven't really kept up with the details of processor designs since then. In particular, you might not be aware of some key topics that developed rapidly in recent times... - pipelining (superscalar, OOO, VLIW, …

computer-architecture, computer-science
SIGARCH

For decades, we have designed chips in fundamentally the same way: human intuition applied to a vanishingly small slice of an impossibly large design space. That paradigm worked when Moore’s Law was lifting everything. We could afford to be wrong. We could afford to miss the best design. Process scaling would close the gap. That […]

computer-architecture, computer-science
DEV Community

Over about 10 weeks, I built a bare-metal SPMC at S-EL2 that boots Linux, manages Secure Partitions, and runs alongside Android pKVM on the same SoC. I built an ARM64 hypervisor that runs next to Google's pKVM on the same chip. pKVM takes the Normal world at NS-EL2. My hypervisor takes the Secure world at S-EL2. They coordinate through ARM's FF-A protocol, relayed by EL3 firmware. 35 end-to-end t…

computer-architecture, computer-science
WWW Computer Architecture Page
4/8/2026

IEEE Transactions on Parallel and Distributed Systems (TPDS) -- Special Issue on CMP Architectures

computer-architecture, computer-science
Cryptology ePrint Archive

Paper 2026/677
SPLASH: SPeculative Leakage-Adaptive Secure Hardware
Abstract
Modern processors are largely fixed at the time of fabrication, rendering post-silicon security updates infeasible. This lack of flexibility is especially problematic for speculative execution attacks, which exploit microarchitectural optimizations to leak sensitive information through transient execution. However, exist…

computer-architecture, engineering
Semiconductor Digest

Today, Cadence announced an expansion of its broad collaboration with NVIDIA to accelerate Cadence’s Design for AI and AI for Design strategy. The next generation of agentic AI design solutions includes autonomous, long-running agents that require accelerated, trusted, physics-grounded engines to translate design intent into automated flows, generate designs and debug errors, and manage long, com…

ai, computer-architecture, engineering, machine-learning
Department of Computer Science, Columbia University
SIGARCH

As we close the book on 2025, Computer Architecture Today has seen another successful year of community engagement. We published 29 posts covering a wide spectrum of topics—from datacenter energy-efficiency to the evolving debate on LLMs in peer review, alongside trip reports from our major conferences. I want to thank all our authors for their insights, with special appreciation for those who co…

computer-architecture, computer-science
SIGARCH

Large language model (LLM) agents are quickly moving from “single agent” to *multi-agent systems*: tool-using agents, planner-orchestrator, debate teams, specialized sub-agents that collaborate to solve tasks. At the same time, the *context* these agents must operate within is becoming more complex: longer histories, multiple modalities, structured traces, and customized environments. This combin…

computer-architecture, computer-science
IEEE TCCA Blog

It is a sunny morning in the computer architecture research community. In the last few years, our community has multiplied in size, our conferences consistently reach record-high attendance, and the number of active research areas is mind-boggling. Members of our community are recognized with the Turing Award and are leading NSF CISE. While times may be exhilarating, it is important that all of u…

computer-architecture, computer-science
research.io
