
NVIDIA Project DIGITS
Around 2015 I was putting together funds in academia. Convincing IT, senior professors, and finance that yes, it was worth giving me a LOT of cash to build a workstation with multiple GPUs.
…
“No, it isn’t for gaming.”
…
“Yes, it will change the world.”
…
“No, there are no university rules that hardware bought on multiple invoices across multiple departments can’t be used in the same box.”
…
“Yes, I’m aware that all my individual quotes are just below the bureaucracy-summoning purchase limits.”
…
“Yes, I tried random forests along with the other stats and ML methods; this really is better. How do I know? Well…”
Deep Learning was taking off, but in the biotech world it was seen as a regular tech update.
I did get the money and built that loud, jet-engine-sounding monster. Among the first pieces of software I installed was DIGITS (freshly archived), just as Keras had its first release, and all I wanted to do was build a neural network. That little step changed my life.
Today NVIDIA announced hardware also named DIGITS. With the claimed performance (or even anywhere near it), it will be as life-changing for the early explorers as the DIGITS software was for me.
The $3,000 price tag is probably much better justified than it was for another little piece of hardware from last year. I hope to get my hands on a couple of these, not just for LLMs but for my first love, images.
From the press release:
GB10 Superchip Provides a Petaflop of Power-Efficient AI Performance
The GB10 Superchip is a system-on-a-chip (SoC) based on the NVIDIA Grace Blackwell architecture and delivers up to 1 petaflop of AI performance at FP4 precision.
GB10 features an NVIDIA Blackwell GPU with latest-generation CUDA® cores and fifth-generation Tensor Cores, connected via NVLink®-C2C chip-to-chip interconnect to a high-performance NVIDIA Grace™ CPU, which includes 20 power-efficient cores built with the Arm architecture. MediaTek, a market leader in Arm-based SoC designs, collaborated on the design of GB10, contributing to its best-in-class power efficiency, performance and connectivity.
The GB10 Superchip enables Project DIGITS to deliver powerful performance using only a standard electrical outlet. Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage. With the supercomputer, developers can run up to 200-billion-parameter large language models to supercharge AI innovation. In addition, using NVIDIA ConnectX® networking, two Project DIGITS AI supercomputers can be linked to run up to 405-billion-parameter models.
Technically a petaflop at FP4 precision isn’t a proper petaflop, but I’m OK with that kind of impropriety.
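To see why those model sizes line up with 128GB of unified memory, here is a quick back-of-the-envelope sketch. The assumptions are mine, not NVIDIA’s: FP4 weights take half a byte per parameter, and I’m ignoring KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope weight-memory math for the press-release numbers.
# Rough sketch only: ignores KV cache, activations, and runtime overhead.

def weight_memory_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold the model weights."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB, as marketing tends to count

for params in (200, 405):
    for bits in (4, 16):
        print(f"{params}B params @ FP{bits}: ~{weight_memory_gb(params, bits):,.0f} GB")

# 200B @ FP4  -> ~100 GB, which is why it squeaks into 128 GB of unified memory
# 405B @ FP4  -> ~203 GB, hence linking two units (256 GB combined)
# 200B @ FP16 -> ~400 GB, far beyond a single box
```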
Drunk Bayesians and the standard errors in their interactors
This video is from 2018, and one of the interesting things Gelman touches on is the ability of AI to do model fitting automatically. Gelman argues that an AI designed to perform statistical analysis would also need to deal with the implications of Cantor’s diagonal argument.
Essentially, new models need to be built when new data don’t fit the old model. You go down the diagonal of more data vs increasing model complexity.
This means that an AI cannot have a full model ahead of time, and it must make mistakes. He suggests that such an AI needs to be more like a human, with separate analytical and visual modules and an executive function, rather than a single, monolithic program.
Perhaps we aren’t quite there yet, but the emerging agentic methods look promising in light of his thoughts.
Some cool quotes below that I hope encourage you to watch this longish but lively window into the mind of one of the best-known Bayesian statisticians.
-> I think that there’s something about statistics in the philosophy of statistics which is inherently untidy
-> …in Bayes you do inference based on the model and that’s codified and logical but then there’s this other thing we do which is we say our inferences don’t make sense we reject our model and we need to change it
-> Our statistical models have to be connected to our understanding of the world
On the reproducibility crisis in science in three parts:
Studies not replicating: This is the most obvious part of the crisis, where a study’s findings are not supported when the study is repeated with new data. This casts doubt on the original claims and the related literature, and can invalidate a whole sub-field.
Problems with statistics: Gelman argues that some studies do not do what they claim to do, often due to errors in statistical analysis or research methods. He gives the example of a study about male upper body strength that actually measured the fat on men’s arms.
Three fundamental problems of statistics:
- generalizing from a sample to a population
- generalizing from a treatment to a control group, and
- generalizing from measurements to underlying constructs of interest
This last point is particularly interesting in the biotech space. Which brings us to,
Problems with substantive theory: Many studies lack a strong connection between statistical models and the real world. A better understanding of mechanisms and interactions is necessary for more accurate inferences. Gelman also discusses the “freshman fallacy,” where researchers assume a random sample of the population is not needed for causal inference, when in fact it is crucial if the treatment effect varies among people (especially important if you are trying to discover drugs!). He further notes that the lack of theory and mechanisms leads to not being able to estimate interactions, which are crucial.
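To make the random-sampling point concrete, here is a small simulation sketch. The numbers and setup are my own toy example (using numpy), not anything from the talk: the treatment effect declines with age, the study recruits mostly young volunteers, and even a perfectly randomized experiment within that convenience sample estimates the volunteers’ effect rather than the population’s.

```python
# Toy illustration: randomization inside a convenience sample recovers that
# sample's average treatment effect, not the population's, whenever the
# effect varies across people.
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

age = rng.uniform(20, 80, N)            # a covariate the effect depends on
true_effect = 2.0 - 0.02 * age          # treatment helps younger people more

# Population average treatment effect (~1.0 with these made-up numbers)
print(f"population ATE: {true_effect.mean():.2f}")

# Convenience sample: mostly young volunteers
in_sample = rng.random(N) < np.where(age < 35, 0.9, 0.05)
sample_effect = true_effect[in_sample]

# Perfect randomization *within* the sample still targets the wrong quantity
treated = rng.random(sample_effect.size) < 0.5
outcome = treated * sample_effect + rng.normal(0, 1, sample_effect.size)
estimate = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimate from convenience sample: {estimate:.2f}")  # noticeably larger
```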
There are many more topics he covers, from p-values and economics to Bayesians not being Bayesian enough.
As thanks for providing the source of the snowclone title, here’s some Static
Image: Vintage European-style abacus engraving from New York’s Chinatown: A Historical Presentation of Its People and Places by Louis J. Beck (1898). Original from the British Library. Digitally enhanced by rawpixel.