The Problem

Processor architectures have not been fundamentally rethought for over 25 years, and the end of CMOS scaling will make further performance and efficiency improvements increasingly difficult without a fresh design.

Existing processor architectures were designed at a time when the energy required to move data was roughly equal to the energy required to do useful computation with that data.

Chart: energy cost of performing a 64-bit FLOP (in picojoules)

Moving 64 bits from off-chip memory takes over 40x more energy than the double-precision floating point operation actually performed on that data.


REX Computing is rethinking the traditional hardware-managed cache hierarchy; by removing this unnecessary complexity, it is able to significantly reduce power consumption and total die area.

The REX Neo Architecture

REX Computing is developing a new, hyper-efficient processor architecture targeting the requirements of the supercomputers of today and all the computers of tomorrow.

To do this, we are throwing out the feature creep and bloat of processors of the past 30 years, and leveraging improvements in the world of software to strip the processor itself down to only what is necessary.

In doing so, we are able to deliver a 10 to 25x increase in energy efficiency at the same performance level compared to existing GPU and CPU systems.

The Neo Chip

256 cores per chip, scratchpad memory, a 2D-mesh interconnect, and a revolutionary high bandwidth chip-to-chip interconnect achieve:

256 GFLOPs (Double Precision) or 512 GFLOPs (Single Precision)
at 64 to 128 GFLOPs/watt

  • Similar performance for integer calculations.
  • Balanced memory bandwidth allows near-theoretical peak performance.
  • Extreme scalability: a near-limitless number of Neo chips per node.

Neo Toolchain

Software development is an important part of the Neo architecture roadmap. The REX software team is working hard to build an easy-to-use modern toolchain to simplify writing new Neo applications and porting existing software to our platform.

A fundamentally new hardware environment brings about several exciting challenges:

Memory Management
  • Neo’s impressive power efficiency relies on many simplifications at the hardware level. As a result, our runtime environment must meet unique requirements to ensure the task of memory management is abstracted away from users without needing complicated garbage collection or language restrictions.
  • The primary space for storing program code and data on Neo is made up of Scratchpad Memory. In other words, our chip does not need all of the logic required for hardware managed caching and coherency that you might find in existing processors and accelerators. Instead, we place a local memory space inside each of our cores that is lower power and lower latency (faster) than traditional caches, giving the REX Neo architecture a significant power and performance advantage over other architectures.
  • Our team shared an update on new techniques at the MEMSYS 2015 conference. You can access our paper here.
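The software-managed scratchpad pattern described above can be sketched as an explicit copy-in/compute/copy-out loop. The `spm_*` function names and the 128 KB scratchpad size below are hypothetical stand-ins (the actual Neo runtime interface is not described here), modeled with plain `memcpy` into a static buffer:

```c
/* Illustrative sketch of software-managed scratchpad memory.
   The spm_* names and the 128 KB size are hypothetical stand-ins for
   whatever DMA/ISA operations a real scratchpad runtime would provide;
   here they are modeled with memcpy into a static buffer. */
#include <stddef.h>
#include <string.h>

#define SPM_BYTES (128 * 1024)                  /* assumed per-core scratchpad */
static double spm[SPM_BYTES / sizeof(double)];  /* stand-in for local memory   */

/* Explicitly stage a tile into the scratchpad (would be a DMA/ISA op). */
static void spm_copy_in(const double *src, size_t n)
{
    memcpy(spm, src, n * sizeof(double));
}

/* Explicitly write a tile back out to far memory. */
static void spm_copy_out(double *dst, size_t n)
{
    memcpy(dst, spm, n * sizeof(double));
}

/* Scale an array tile-by-tile out of the scratchpad: software, not a
   hardware cache, decides exactly what data is resident and when. */
void scale(double *data, size_t n, double factor, size_t tile)
{
    for (size_t base = 0; base < n; base += tile) {
        size_t len = (n - base < tile) ? (n - base) : tile;
        spm_copy_in(data + base, len);        /* explicit movement in...    */
        for (size_t i = 0; i < len; i++)      /* ...compute on local data   */
            spm[i] *= factor;
        spm_copy_out(data + base, len);       /* ...and explicit write-back */
    }
}
```

Because the data movement is explicit, a compiler or runtime can schedule it deliberately (for example, double-buffering transfers against compute), which is where the power and latency advantage over a hardware-managed cache comes from.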
General Programmability
  • Architectures that try to achieve goals similar to the Neo chip often rely on application-specific simplifications. For example, GPUs (graphics processing units) are great for tasks like video rendering and machine learning, but are not nearly as competent at running a web browser or operating system as a CPU (central processing unit). Similarly, DSP (digital signal processor) chips are often challenging to program and may rely on hardware accelerators to achieve efficient performance in many applications. We are not immune from these challenges, but are instead taking a different approach.
  • Our hardware simplifications are focused on exposing low level functionality. Rather than designing to provide higher level interfaces to programmers, which would require making many assumptions about workload characteristics, we strip away abstractions and implement them in software. Once a feature exists in software, it can easily be enabled or disabled at will, or even modified as application requirements change. Our hardware and software development teams work together very closely to strike a balance between extreme efficiency and reconfigurability.
  • Neo does not force software to adhere to any one programming model. Flexibility of execution modes and access to time-predictable atomic core-to-core messaging can support a variety of higher level models.
  • Our custom ISA and hardware-aware optimization approach aim to maximize parallelism and enable rapid prototyping in many higher level languages.


Our Team

Thomas worked at the MIT Institute for Soldier Nanotechnologies for 3 years, first as an end user of HPC systems and later designing and building them at the lab. This experience led to him starting REX Computing in 2013 as a recipient of Peter Thiel’s “20 Under 20” Fellowship, where he leads architectural design and operations. Thomas has been featured on Forbes’ 30 Under 30 list and is a project lead for the Open Compute Project HPC Group.


Paul started programming as a child, and studied CS at Georgia Tech. He has worked in fields including structural biology, theoretical ecology, and nanofabrication. Paul was part of the 2012 class of Thiel Fellows, where he founded a synthetic biology startup and worked at Lawrence Berkeley National Lab for 18 months. He later joined Thomas in starting REX, where he contributes extensive knowledge of low-level software and tool development.


Our Advisors

  • John Gustafson: Visiting Professor, A*STAR Computational Resource Centre
    former Director of eXtreme Technologies Lab, Intel
    former Chief Product Architect, AMD.

  • Bill Boas: Chairman, System Fabric Works
    former Director, Cray
    former Advanced Architectures Team, Lawrence Livermore National Lab

  • Kevin Moran: CEO, System Fabric Works.


Moving data on a chip can take 40 times more energy than computing on that same data, and it slows things down as well. By managing memory in software rather than locking it down in hardware, REX is saving energy and boosting speed.
The company recently received $1.25 million in funding from Founders Fund, a venture capital firm cofounded by Peter Thiel.
The result of this research is an upcoming chip called Neo, which brings to bear a new architecture, instruction set, and core design. If its design assumptions hold, Neo can reach a power envelope and performance target that go beyond the current requirements for exascale computing goals.
