Tag: AI

  • The Shelter as Epistemic Engine

    This is a continuation of my ongoing exploration of places and spaces. Previously: We need homes in the delta quadrant, Thinking with places, Problems are places questions are spaces.


    Introduction: The Terror of the Open Field

    We tend to think of “Space” as a vacuum—an emptiness waiting to be filled. But geographically and philosophically, Space is actually a condition of high-entropy potential. As Yi-Fu Tuan famously articulated, space is “freedom,” but it is also “possibility without orientation.” It is the open field where everything is possible, which means nothing is yet distinct.

    Entering a new scientific field is remarkably similar to entering a strange, sprawling city at night. Both are vast, unmapped, and overwhelming in their sensory input; the streets (or citations) wind in directions you cannot predict, and the logic of the layout remains hidden. You are surrounded by data, but devoid of information.

    In this state, you cannot simply “exist.” Without a point of reference—a coordinate, a hypothesis, a base camp—movement is indistinguishable from drift. To explore a new territory, whether it is the Delta Quadrant or a novel theory of computation, you first need a place to stand.

    We often mistake “Places”—our homes, our labs, our established theories—for static containers designed to protect us from the unknown. We view them as retreats.

    I propose a different view: Real “homes” are not retreats; they are Concreteness Engines. They are the active, necessary interruptions of infinity that allow us to process the world.

    “Exploration of space through the affordance of places. Identity creation.” By Venkatesh Rao’s Bucket Art, prompted by me.

    I. The Engine: Configurancy

    To understand how a home functions as an engine, we need to look at the underlying physics of how things fit together. Venkatesh Rao recently proposed a new ontological primitive for this, a concept he calls Configurancy.

    Rao defines Configurancy as the “ongoing, relational, temporally unfolding process through which agents and worlds co-emerge.” It is non-teleological; it doesn’t have a “goal” like Heidegger’s Care. It is simply the structural logic of how elements align to create a world that hangs together.

    This provides the missing mechanical link in our understanding of place-making.

    The universe’s configurancy has no inherent goal—it just is. Entropy and evolution shuffle relations without asking why. But humans do have a goal: intelligibility. We need the world to make sense.

    Here lies the synthesis: Place-making is the manual application of configurancy.

    When we build a home in the unknown, we are engaged in the active engineering work of aligning data, tools, and protocols. We are taking the raw, washing-over “Space” and forcing it into a relational alignment that makes it navigable. We are taking the background hum of the universe and tuning it until it resonates as a signal.

    II. The Anchor: Generating Concreteness

    The primary problem with the unknown is not that it is empty, but that it is slippery. It is purely abstract. You cannot interact with “The Literature” or “The Market” or “The Frontier” as a whole; the bandwidth is simply too high.

    A “Home”—whether that is a physical shelter, a published paper, or a foundational startup thesis—functions by freezing the flow. It creates a local boundary where active relations stabilize long enough to be examined.

    Consider the mechanism of a scientific citation. A natural phenomenon is dynamic, messy, and fleeting. But when a scientist writes a paper, they freeze that dynamic phenomenon into a static reference. They turn the anomaly into a “Fact.”

    Similarly, in a city, a “landmark” freezes the endless flow of streets into a fixed coordinate. “Meet me at the clock tower” turns a grid of infinite motion into a singular point of orientation.

    This is the epistemic function of shelter. A home doesn’t just hide us from the wind; it renders reality. It is a processing center that turns abstract “Space” into concrete “Place,” giving us a tangible handle on the world.

    III. The Trajectory: Carrying the Protocol

    There is a trap here, however. We can easily fall into “Container Metaphysics”—the belief that the Anchor is the point. If we believe the safety of the shelter is the goal, we stop exploring. We get stuck in the comfort of the known, resulting in stasis.

    True exploration is not wandering; it is the ability to carry the protocol of place-making with you. This is what Rao might describe as “high configurancy”—a state where the relational structure is stable enough to evolve, but fluid enough to move.

    We can distinguish here between the Tourist and the Explorer.

    • The Tourist wanders through Space, relying on pre-existing places made by others. They consume intelligibility.
    • The Explorer generates Place. They are capable of “tear-down” and “re-configuration.”

    The Explorer understands that the shelter is not a final destination. It is a platform to project into the unknown. We build the base camp not to live in it forever, but to inhabit the transition between the known and the unknown.

    IV. The Explorer’s Stack

    To survive and understand the unknown, we don’t build fortresses of stone; we build Stacks of intelligibility. If we look at the architecture of a “Home” in the Delta Quadrant, it breaks down into three layers:

    1. The Physical Layer (Hardware)

    This is the instrument, the sensor, the wall, the hull of the ship.

    Function: The hard interface that touches raw physics and space. It provides the minimum viable protection required to exist.

    2. The Protocol Layer (Configurancy)

    This is the Scientific Method, the “Rules of Thumb,” the cultural habits, the checklist.

    Function: This is the engine room. It is the code that aligns the observer with the territory. It is the set of relational instructions that tells us how to organize the chaos outside into a pattern inside.

    3. The Interface Layer (Meaning)

    This is the sense of “Place.” The feeling of “I know where I am.”

    Function: The dashboard where alignment registers as understanding. This is where the raw data of the physical layer, processed by the protocol layer, renders as a world we can inhabit.

    Conclusion: Orientation is the Precondition for Motion

    Rao’s Configurancy and the model of Place-making describe the same fundamental truth: Being is the act of alignment.

    We build homes in the unknown—whether that is a literal frontier or a new intellectual discipline—not to hide from the reality of it, but to have a “runtime environment” where we can compile the code to understand it.

    Place is not a retreat from the world. It is the processing center required to render the world concrete enough to be explored. We do not leave the Delta Quadrant to go home; we build a home so that the Delta Quadrant becomes a place we can finally see.

  • The Tortured Artist Is So Yesterday

    41 years ago, Samuel Lipman wrote that an artist’s life is a “constant—and constantly losing—battle” against one’s own limits. That image has lasted because print culture taught us to imagine the artist as a solitary figure whose worth is measured by the perfection of a single, final work. Print fixed texts in place, elevated the individual author, and made loneliness part of the creative job description.

    That world is slipping away.
    And with it, the tortured artist.

    Twittering Machine (Die Zwitscher-Maschine) is a 1922 watercolor with gouache, pen-and-ink, and oil transfer on paper by Swiss-German painter Paul Klee

    LLMs have made competent expression abundant. The blank page no longer terrifies; anyone can produce something fluent and polished. When craft becomes cheap, suffering loses its meaning as a marker of artistic seriousness. What becomes scarce instead is the willingness to take a risk—not in private, but in public, where a stance can fail, provoke, or be reshaped by others.

    Venkatesh Rao recently argued that authorship is no longer about labor but about courage: the courage to commit to a line of thought and accept the consequences of being wrong. In an era of infinite variations, the decisive act is not creation but commitment. The value lies in staking something of yourself on an idea that may not survive.

    This shift is reshaping where culture is made. In what I’ve called the “Cloister Web,” people draft and explore ideas in semi-private creative rooms before carrying only a few into the open. LLMs make experimentation cheap; they also make commitment expensive. The hard part now is choosing which idea you are willing to be accountable for.

    As the burden of execution drops, something else rises: genuine collaboration. Not just collaboration with models, but with other humans. Andrew Gelman, reflecting on Lipman in a recent StatModeling post, noted that scientists, too, feel their own versions of this solitary-creator pressure. In science, the burden rarely falls on one person. The struggle is distributed across collaborative projects that outlive any single contributor.

    Groups can explore bolder directions than any one creator working alone. Risk spreads, ideas compound, and the scale of what can be attempted expands. The solitary genius was an artifact of print; the collaborative creative lab is the natural form of the world we are entering.

    This leads to a claim many will resist but few will be able to ignore: the single author is beginning to collapse as a cultural technology. What will matter in the coming decades is not the finished artifact but the evolving line of thought carried forward by teams willing to take risks together.

    The tortured artist belonged to an age defined by scarcity, perfection, and solitude. Today’s creator faces a different task: to choose a risk worth taking and the collaborators worth taking it with. The work endures not because it is flawless, but because a group has committed to pushing it forward.

    Pain is optional now.

    Risk isn’t.

  • Four Early-Modern Tempers for a World That Can Summon Itself

    This is a partial synthesis of the books read through 2025 in the Contraptions Book Club.

    We live in a moment when the whole of human culture has become strangely available, no longer just an archive but something that behaves like a responding presence. A sentence typed into a search bar or messaging window returns citations and, more strikingly, continuations: pastiche, commentary, new variations of ideas that never existed until the instant we requested them. The canon now behaves more like a voice than a library. It is easy to treat this as convenience, yet summoning culture alters our relation to meaning in ways we are only beginning to see. The question is no longer whether we can find the relevant text, but what it means to think in a world that can generate its own echoes.

    This instability has precedents in the late fifteenth and sixteenth centuries. Print multiplied texts; voyages multiplied worlds; the Reformation multiplied authorities. Four writers—Thomas More, Michel de Montaigne, Giordano Bruno, and Ibn Khaldun—stood at different corners of that era’s turbulence. Read from a certain angle, they reveal four temperaments that recur whenever the world grows larger and more articulate than before. They capture four ways of holding meaning in a world where frames widen and boundaries blur.

    Their temperaments arose under the tension of two kinds of pressures. One pressure concerns frame: how much of the world a thinker attempts to hold in view. Another concerns form: how rigidly one tries to shape or preserve meaning in the face of flux. The tension between narrow and wide frames, between hard and soft forms, is a recurring feature of intellectual upheaval. It is with us again.

    Northeaster (1895) by Winslow Homer. Original from The MET museum.

    Thomas More and the dream of designed simplicity

    When More wrote Utopia, he was answering a world that felt newly disordered: economic enclosure, fracturing religion, unfamiliar continents, and the early tremors of what we now call modernity. His response was to shrink the frame to a bounded island and then remake that island according to simple, intelligible rules. Clothes are standardized; work is scheduled; houses interchangeable. Property, that generator of complexity, is abolished.

    This gesture—the compression of a vast, unruly world into a legible miniature—reflects a deep conviction that the good life can be engineered by eliminating what does not fit the plan. Yet much of what makes human life livable emerges not from design but from the unplanned: the pleasure of choosing one’s clothes, improvising a routine, rearranging a room, wandering through a market whose wares no one fully controls. These small freedoms, these ambient textures, carry a kind of happiness that explicit blueprints rarely acknowledge. More’s island, for all its order, feels airless because it denies the subtle satisfactions of emergence.

    We still see this impulse today, whenever we imagine that meaning will return if only we can simplify the world enough—reduce choices, curtail variation, enforce legibility. It is a refusal to accept that complexity is a problem to be solved only up to a point, beyond which it becomes the medium of human flourishing.

    Montaigne and the work of making knowledge one’s own

    Montaigne faced the same expansion of texts and reports, but his answer was almost the inverse of More’s. In his Essays, he turned the proliferating world into material for a sustained inquiry into a single life—his own. He narrowed the frame even further—not to an island but to a single life—and then allowed that life’s boundaries to loosen. His essays are records of a mind being changed by what it reads and observes. They are porous documents, absorbing classical quotations, passing impressions, and the texture of his shifting moods.

    He described this process with the image of bees making honey: they gather from thyme and marjoram, but the result is neither; the ingredients have been transformed.  Mere access to texts is not enough. The material must be digested until it becomes inseparable from the person who has absorbed it. 

    This is a temperament well suited to a world in which culture can speak back in any tone we request. The ease of access makes superficial familiarity almost effortless; the difficulty lies in allowing the material to ferment into something one can honestly call one’s own. Montaigne’s form is soft, because he does not impose a system on the world or on himself. He lets contradictions remain. His essays show what inward honesty looks like when the outer world has grown noisy.

    Bruno: infinite worlds, unreliable memory

    If Montaigne compresses the world into a single consciousness, Bruno explodes it. In works such as On the Infinite Universe and Worlds, he offered a speculative cosmology that pushed beyond the scientific imagination of his time. His universe is infinite, populated by innumerable worlds, animated by a universal divinity. These were not scientific inferences—they were imaginative leaps, metaphysical provocations in a period when the cosmological picture was coming loose.

    Bruno’s response to the widened cosmos led him to enlarge the frame until it became boundless. Boundaries, for him, were treated as provisional, always liable to be surpassed. He was fascinated by memory—its limits, its artifices, its potential for augmentation. His elaborate mnemonic wheels were attempts to externalize thinking, to allow a mind to move through more space than it could otherwise hold.

    There is something oddly familiar in this, not because our devices prove Bruno right, but because they echo his aspirations. We have built systems that externalize memory, recombine fragments, and present them as if they had always existed. These contrivances are not cosmic, yet they invite a cosmic mood—a sense that boundaries have thinned, that the archive stirs, that the mind can wander farther than it once could. Bruno illustrates the allure and the danger: the exhilaration of boundless possibility, and the risk of believing that imagination alone can stand in for contact with the world.

    Ibn Khaldun and patterns at civilizational scale

    Ibn Khaldun took the widening of the world seriously, but he kept his feet on the ground. In the Muqaddimah, his great introduction to history, he sketched a theory of how societies cohere, flourish, and decline. His frame is large—empires, dynasties, generations—yet his form is restrained. He offers no blueprint for an ideal state. He offers something closer to a natural history of political life: groups harden and cohere, conquer, soften, decay, and are replaced. Boundaries matter to him—the line between desert and city, between ruler and ruled—but they are not eternal. They shift, erode, reemerge.

    His stance avoids both utopian control and ecstatic dissolution. It is descriptive, analytical, patient. He wants to see how things actually behave across time. In a world that now contains its own searchable memory and can generate plausible continuations of its past, this way of looking feels newly relevant. The swirl of events becomes legible only when placed against deeper patterns. Ibn Khaldun’s gift is to show that large frames can coexist with modesty of form.

    Two diagonals

    One can sense two lines running through these four positions. On one line are More and Bruno—the designer of tight enclosures and the dissolver of all enclosures. Both feel the shock of a world grown too large and respond by refusing its messiness: one by shrinking it to a legible fragment, the other by exploding it into a metaphysical totality. Both try to replace the world’s emergent complexity with a clarity of their own making.

    The other line runs between Montaigne and Ibn Khaldun. Both accept that the world, whether at the scale of a single life or of a civilization, has a texture that cannot be fully captured by design or metaphysics. Both are interested in how things actually unfold, without forcing them into an ideal shape. Their frames differ—one intimate, one panoramic—but their attitude toward form is similar: let patterns emerge, let boundaries be porous enough to reveal movement, let humility guide description.

    This second diagonal sits more naturally with a culture that can be summoned on demand. When the archive can answer back in endless variations, attempts to design simplicity or to dissolve all limits tend to fatigue. What remains workable is the inward practice of belonging to oneself and the outward practice of reading patterns without imagining them eternal.

    Temper temper

    We now inhabit a world in which knowledge behaves differently than any earlier generation anticipated. It can be queried, ventriloquized, recombined. This does not tell us how to live, but it changes the background against which living takes place. More’s dream of a perfectly designed order feels at once more possible and more implausible. Montaigne’s slow digestion of borrowed thought feels newly demanding. Bruno’s intoxication with boundlessness feels familiar, and Ibn Khaldun’s attention to cycles and decay feels newly sober.

    These tempers recur whenever the world becomes more articulate than before. Ours is such a moment. We can now create stable points of reference with enough meaning and legibility to allow exploration of surrounding space. Print unlocked the beta version of this superpower. These four writers, shaped by the last great expansion of the world’s voice, find themselves speaking again through us, as we try to understand what it means to think with culture on tap.

  • The Small God of the Internet

    It was a small announcement on an innocuous page about “spring cleaning”. The herald, some guy with the kind of name that promised he was all yours. Four sentences you only find because you were already looking for a shortcut through life. A paragraph, tidy as a folded handkerchief, explained that a certain popular reader of feeds was retiring in four months’ time. Somewhere in the draughty back alleys of the web, a small god cleared his throat. Once he had roared every morning in a thousand offices. Now, when people clicked for their daily liturgy, the sound he made was… domesticated.

    He is called ArrEsEs by those who enjoy syllables. He wears a round orange halo with three neat ripples in it. Strictly speaking, this is an icon1, but gods are not strict about these things. He presides over the River of Posts, which is less picturesque than it sounds and runs through everyone’s house at once. His priests are librarians and tinkerers and persons who believe in putting things in order so they can be pleasantly disordered later. The temple benches are arranged in feeds. The chief sacrament is “Mark All As Read,” which is the kind of absolution that leaves you lighter and vaguely suspicious you’ve got away with something.

    Guide for Constructing the Letter S from Mira Calligraphiae Monumenta or The Model Book of Calligraphy (1561–1596) by Georg Bocskay and Joris Hoefnagel. Original from The Getty. Digitally enhanced by rawpixel.

    There was a time the great city-temples kept a candle lit for him right on their threshold. The Fox of Fire invited him in and called it Live Bookmarks.2 The moldable church, once a suit, then a car, then a journey, in typical style stamped “RSS” beside the address like a house number. The Explorer adopted the little orange beacon with the enthusiasm of someone who has been told there will be cake. The Singers built him a pew and handed out hymnals. You could walk into almost any shrine and find his votive lamp glowing: “The river comes this way.” Later, accountants, the men behind the man who was yours, discovered that candles are unmonetizable and, one by one, the lamps were tidied into drawers that say “More…”.

    ArrEsEs has lineage. Long before he knocked on doors with a bundle of headlines, there was Old Mother Press, the iron-fingered goddess of moveable type, patron of ink that bites and paper that complains. Her creed was simple: get the word out. She marched letters into columns and columns into broadsides until villages woke up arguing the same argument.3 ArrEsEs is her great-grandchild—quick-footed, soft-spoken—who learned to carry the broadsheet to each door at once and wait politely on the mat. He still bears her family look: text in tidy rows, dates that mind their place, headlines that know how to stand up straight.

    Four months after the Announcement, the big temple shut its doors with a soft click. The congregation wandered off in small, stubborn knots and started chapels in back rooms with unhelpful names like OGRP4. ArrEsEs took to traveling again, coat collar up, suitcase full of headlines, knocking on back doors at respectable intervals. “No hurry,” he would say, leaving the bundle on the step. “When you’re ready.” The larger gods of the Square ring bells until you come out in your slippers; this one waits with the patience of bread.

    Like all small gods, he thrives on little rites. He smiles when you put his name plainly on your door: a link that says feed without a blush. He approves of bogrolls, er, blogrolls, because they are how villages point at one another and remember they are villages. He warms to OPML, which is a pilgrim’s list people swap like seed packets. He’s indulgent about the details—/rss.xml, /atom.xml, /feed, he will answer to all of them—but he purrs (quietly; dignified creature) for a cleanly formed offering and a sensible update cadence5.

    His miracles are modest and cannot be tallied on a quarterly slide. He brings things in the order they happened. He does silence properly. The river arrives in the morning with twenty-seven items; you read two, save three, and let the rest drift by with the calm certainty that rivers do not take offense. He remembers what you finished. He promises tomorrow will come with its own bundle, and if you happen to be away, he will keep the stack neat and not wedge a “You Might Also Like” leaflet between your socks.

    These days, though, ArrEsEs is lean at the ribs. The big estates threw dams across his tributaries and called them platforms. Good water disappeared behind walls; the rest was coaxed into ornamental channels that loop the palace and reflect only the palace. Where streams once argued cheerfully, they now mutter through sluices and churn a Gloomwheel that turns and turns without making flour—an endless thumb-crank that insists there is more, and worse, if you’ll just keep scrolling. He can drink from it, but it leaves a taste of tin and yesterday’s news.

    A god’s displeasure tells you more than his blessings. His is mild. If you hide the feed, he grows thin around the edges. If you build a house that is only a façade until seven JSters haul in the furniture, he coughs and brings you only the headline and a smell of varnish6. If you replace paragraphs with an endless corridor, he develops the kind of seasickness that keeps old sailors ashore. He does not smite. He sulks, which is worse, because you may not notice until you wonder where everyone went.

    Still, belief has a way of pooling in low places. In the quiet hours, the little chapels hum: home pages with kettles on, personal sites that remember how to wave, gardeners who publish their lists of other gardeners. Somewhere, a reader you’ve never met presses a small, homely button that says subscribe. The god straightens, just a touch. He is gentler than his grandmother who rattled windows with every edition, but the family gift endures. If you invite him, tomorrow he will be there, on your step, with a bundle of fresh pages and a polite cough. You can let him in, or make tea first. He’ll wait. He always has.


    Heavily edited sloptraption.


    1. He maintains it’s saffron, which is what halos say when they are trying to be practical ↩︎
    2. The sort of feature named by a librarian, which is to say, both accurate and doomed. ↩︎
    3. Not to be confused with the software that borrowed her title and a fair chunk of her patience. ↩︎
    4. Old Google Reader People ↩︎
    5. On festival days he will accept serif, sans-serif, or whatever the village printer has not yet thrown at a cat. ↩︎
    6. He can drink JSON when pressed; stew remains his preference. ↩︎
  • Why Every Biotech Research Group Needs a Data Lakehouse

    start tiny and scale fast without vendor lock-in

    All biotech labs have data, tons of it, and the problem is the same across scales: accessing data across experiments is hard. Often data simply gets lost on somebody’s laptop, with a pretty plot on a poster as the only clue it ever existed. The problem becomes almost insurmountable if you try to track multiple data types, and running any kind of data management effort used to carry a large overhead. New technology like DuckDB and its data lakehouse infrastructure, DuckLake, tries to make adoption easy and to scale with your data, all while avoiding vendor lock-in.

    American Scoter Duck from Birds of America (1827) by John James Audubon (1785 – 1851 ), etched by Robert Havell (1793 – 1878).

    The data dilemma in modern biotech

    High-content microscopy, single-cell sequencing, ELISAs, flow-cytometry FCS files, Lab Notebook PDFs—today’s wet-lab output is a torrent of heterogeneous, PB-scale assets. Traditional “raw-files-in-folders + SQL warehouse for analytics” architectures break down when you need to query an image-derived feature next to a CRISPR guide list under GMP audit. A lakehouse merges the cheap, schema-agnostic storage of a data lake with the ACID guarantees, time-travel, and governance of a warehouse—on one platform. Research teams, whether at the discovery or clinical-trial stage, get faster insights, lower duplication, and smoother compliance when they adopt a lakehouse model.

    Lakehouse super-powers for biotech

    • Native multimodal storage: Keep raw TIFF stacks, Parquet tables, FASTQ files, and instrument logs side-by-side while preserving original resolution.
    • Column-level lineage & time-travel: Reproduce an analysis exactly as of “assay-plate upload on 2025-07-14” for FDA, EMA, or GLP audits.
    • In-place analytics for AI/ML: Push DuckDB/Spark/Trino compute to the data; no ETL ping-pong before model training.
    • Cost-elastic scaling: Store on low-cost S3/MinIO today; spin up GPU instances tomorrow without re-ingesting data.
    • Open formats: Iceberg/Delta/Hudi (and now DuckLake) keep your Parquet files portable and your exit costs near zero.

    DuckLake: an open lakehouse format to prevent lock-in

    DuckLake is still pretty new and isn’t quite production ready, but the team behind it is the same as DuckDB’s, and I expect they will deliver high quality as 2025 progresses. Data lakes, and even lakehouses, are not new at all. Iceberg and Delta pioneered open table formats, but they still scatter JSON/Avro manifests across object storage and bolt on a separate catalog database. DuckLake flips the design: all metadata lives in a normal SQL database, while data stays in Parquet on blob storage. The result is simpler, faster, cross-table ACID transactions—and you can back the catalog with Postgres, MySQL, MotherDuck, or even DuckDB itself.
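    To make this concrete, here is a minimal sketch of the on-ramp using DuckDB’s Python client. The attach syntax follows the DuckLake docs as of this writing and may shift while the format matures; the catalog file, data path, and table names are made up for illustration.

        import duckdb

        con = duckdb.connect()
        con.execute("INSTALL ducklake")
        con.execute("LOAD ducklake")

        # Catalog metadata lives in a plain SQL database (here a local DuckDB
        # file); the data itself lands as Parquet files under DATA_PATH.
        con.execute("ATTACH 'ducklake:lab_catalog.ducklake' AS lab (DATA_PATH 'lake_data/')")
        con.execute("USE lab")

        # From here on it is ordinary SQL.
        con.execute("CREATE TABLE assay_results (plate_id VARCHAR, well VARCHAR, signal DOUBLE)")
        con.execute("INSERT INTO assay_results VALUES ('P001', 'A01', 0.42)")

        # Time travel: each commit is a snapshot you can query later. The
        # AT (VERSION => ...) syntax is per the DuckLake docs; worth
        # re-checking against the current release.
        con.sql("SELECT * FROM assay_results AT (VERSION => 1)").show()

    Swap the local catalog for Postgres and the local data path for S3 or MinIO later; the SQL stays the same.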

    Key take-aways:

    • No vendor lock-in: Because operations are defined as plain SQL, any SQL-compatible engine can read or write DuckLake—good-bye proprietary catalogs.
    • Start on a laptop, finish on a cluster: DuckDB + DuckLake runs fine on your MacBook; point the same tables at MinIO-on-prem or S3 later without refactoring code.
    • Cross-table transactions: Need to update an assay table and its QC log atomically? One transaction—something Iceberg and Delta still treat as an “advanced feature.”
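    And the atomic two-table update from the last bullet, continuing the hypothetical connection from the sketch above (qc_log is another made-up table):

        # Either both writes land or neither does.
        con.execute("BEGIN TRANSACTION")
        con.execute("INSERT INTO assay_results VALUES ('P002', 'B03', 0.88)")
        con.execute("INSERT INTO qc_log VALUES ('P002', 'passed', CURRENT_TIMESTAMP)")
        con.execute("COMMIT")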

    Psst… if you don’t understand or don’t care what ACID, manifests, or object stores mean, assign a grad student; it’s not complicated.

  • Work or Play? Ludic Feedback Loops

    In his Substack post today, Venkatesh Rao wrote about reading and writing in the age of LLMs as playing with toys and making toys, respectively. In one part he writes about how the dopamine feedback loop from writing drove his switch from engineering to writing. For him, writing has ludic, play-like, qualities.

    Japanese vintage original woodblock print of birds and butterfly from Yatsuo no tsubaki (1860-1869) by Taguchi Tomoki.

    I have made almost all my “career” decisions as a function of play. I originally started off with a deep love of plants: how to grow them and their impact on the world. I was convinced I was going to have a lot of fun. I did have some. My wonderful undergrad professor literally hand-held me through my first experiments growing tobacco plants from seeds. But that was about it. My next experiment was with woody plants, and growing the seeds alone took 6 months; by the end I had 4 measly leaves to experiment with. I quickly switched to cell biology.

    This one went a bit better and I stayed with the medium through my PhD. Although I was having sufficient aha moments, I knew in the first year that it was still a bit slow. What rescued me was my refusal to do manual analysis. I loved biology, but I refused to sit and do analysis by hand. Luckily, I had picked up sufficient programming skills.

    I could reasonably automate the analysis workflow. It was difficult at first, but the error messages came at the rate I needed them to. I found new errors viscerally rewarding; it was now in game territory. The analysis still held meaning; it wasn’t for some random A/B testing or some LeetCode thing. No, this mattered.

    Machine learning, deep learning, LLMs, and their applications in bio continue to enchant me. I can explore even more with the same effort and time. I interact with biology at the rate of dopamine feedback I need. I have found my ludic frequency.

  • Briefing: The State of Explainable AI (XAI) and its Impact on Human-AI Decision-Making


    This post is a sloptraption, my silk thread in the CloisterWeb. The post was made with the help of NotebookLM. You can chat with the essay and the sources here: XAI NotebookLM Chat


    I. Executive Summary

    The field of Explainable AI (XAI) aims to make AI systems more transparent and understandable, fostering trust and enabling informed human-AI collaboration, particularly in high-stakes decision-making. Despite significant research efforts, XAI faces fundamental challenges, including a lack of standardized definitions and evaluation frameworks, and a tendency to prioritize technical “faithfulness” over practical utility for end-users. A new paradigm emphasizes designing explanations as a “means to an end,” grounded in statistical decision theory, to improve concrete decision tasks. This shift necessitates a human-centered approach, integrating human factors engineering to address user cognitive abilities, potential pitfalls, and the complexities of human-AI interaction. Practical challenges persist in implementation, including compatibility, integration, performance, and, crucially, inconsistencies (disagreements) among XAI methods, which significantly undermine user trust and adoption.

    Poppies and Daisies (1867) by Odilon Redon. Original from the Art Institute of Chicago. Digitally enhanced by rawpixel.

    II. Core Concepts and Definitions

    • Explainable AI (XAI): A research area focused on making AI system behaviors and decisions understandable to humans, aiming to increase trustworthiness, transparency, and usability. The term itself gained prominence around 2016, though the need for explainability in AI has existed for decades.
    • Contextual Importance and Utility (CIU): A model-agnostic, universal foundation for XAI based on Decision Theory. CIU extends the traditional linear notions of “importance” (of an input) and “utility” (of an input value toward an outcome) to non-linear AI models. It explicitly quantifies how the importance of an input and the utility of its values change based on other input values (the “context”). A toy numeric sketch of the two quantities follows at the end of this section.
    • Contextual Importance (CI): Measures how much modifying a given set of inputs in a specific context affects the output value.
    • Contextual Utility (CU): Quantifies how favorable (or unfavorable) a particular input value is for the output in a given context, relative to the minimal and maximal possible output values.
    • Distinction from Additive Feature Attribution Methods (e.g., LIME, SHAP): CIU is theoretically more sound for non-linear models as it considers the full range of input variations, not just local linearity (partial derivatives). Additive methods lack a “utility” concept and might produce misleading “importance” scores in non-linear contexts.
    • Decision Theory: “A branch of statistical theory concerned with quantifying the process of making choices between alternatives.” It provides clear definitions of input importance and utility, intended to support human decision-making.
    • Human Factors Engineering (HFE): An interdisciplinary field focused on optimizing human-system interactions by understanding human capabilities and limitations. It aims to design systems that enhance usability, safety, and efficiency, and is crucial for creating human-centered AI.
    • Key HFE Principles: User-Centered Design, Minimizing Cognitive Load, Consistency and Predictability, Accessibility and Inclusivity, Error Prevention and Recovery, Psychosocial Considerations, Simplicity and Clarity, Flexibility and Efficiency, and Feedback.
    • Explainability Pitfalls (EPs): Unanticipated negative downstream effects from adding AI explanations that occur without the intention to manipulate users. Examples include misplaced trust, over-estimating AI capabilities, or over-reliance on certain explanation forms (e.g., unwarranted faith in numerical explanations due to cognitive heuristics). EPs differ from “dark patterns,” which are intentionally deceptive.
    • Responsible AI (RAI): A human-centered approach to AI that “ensures users’ trust through ethical ways of decision making.” It encompasses several core pillars:
    • Ethics: Fairness (non-biased, non-discriminating), Accountability (justifying decisions), Sustainability, and Compliance with laws and norms.
    • Explainability: Ensuring automated decisions are understandable, tailored to user needs, and presented clearly (e.g., through intuitive UIs).
    • Privacy-Preserving & Secure AI: Protecting data from malicious threats and ensuring responsible handling, processing, storage, and usage of personal information (security is a prerequisite for privacy).
    • Trustworthiness: An outcome of responsible AI, ensuring the system behaves as expected and can be relied upon, built through transparent, understandable, and reliable processes.
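    To make CI and CU concrete, here is the toy numeric sketch promised above. It brute-forces the two quantities for one input of an arbitrary, made-up black-box model; real CIU implementations estimate the extrema more carefully, but the arithmetic follows the definitions given earlier.

        import numpy as np

        def model(x1, x2):
            # Stand-in for any trained non-linear model.
            return 1 / (1 + np.exp(-(x1 * x2 - 0.5)))

        # The context: the concrete instance being explained.
        x1, x2 = 0.8, 0.9
        y = model(x1, x2)

        # Sweep x1 over its allowed range [0, 1] while holding the context fixed.
        sweep = model(np.linspace(0, 1, 1001), x2)
        ymin_j, ymax_j = sweep.min(), sweep.max()

        # Global output range; known here by construction, estimated in general.
        ymin, ymax = 0.0, 1.0

        ci = (ymax_j - ymin_j) / (ymax - ymin)  # Contextual Importance of x1
        cu = (y - ymin_j) / (ymax_j - ymin_j)   # Contextual Utility of x1 = 0.8
        print(f"CI = {ci:.3f}, CU = {cu:.3f}")

    In this context, x1 can move the output over about 22% of its full range (CI ≈ 0.22), and the current value sits near the favorable end of that band (CU ≈ 0.80).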

    III. Main Themes and Important Ideas

    A. The Evolution and Current Shortcomings of XAI Research

    • Historical Context: The need for explainability in AI is not new, dating back to systems like MYCIN in 1975, which struggled to explain numerical model reasoning. Early efforts focused on “intrinsic interpretability” or “interpretable model extraction” (extracting rules from models), while “post-hoc interpretability” (explaining after the fact) was proposed as early as 1995 but initially neglected.
    • Modern Re-emergence and Limitations: The term “Explainable AI (XAI)” was popularized around 2016, but current research often “tends to ignore existing knowledge and wisdom gathered over decades or even centuries by other relevant domains.” Most XAI work relies on “researchers’ intuition of what constitutes a ‘good’ explanation, while ignoring the vast and valuable bodies of research in philosophy, psychology, and cognitive science of how people define, generate, select, evaluate, and present explanations.”
    • Focus on Technical Metrics over User Utility: Many XAI papers prioritize “internal validity like deriving guarantees on ‘faithfulness’ of the explanation to the model’s underlying mechanisms,” rather than focusing on how explanations improve human task performance. This can lead to methods that are “non-robust or otherwise misleading.”
    • The “Disagreement Problem”: A significant practical challenge where different XAI methods (e.g., SHAP, LIME) generate “conflicting explanations that lead to feature attributions and interpretability inconsistencies,” making it difficult for developers to trust any single explanation. This is reported as the most severe challenge by practitioners, despite being less frequently reported as an initial technical barrier.

    B. The “Means to an End” Paradigm for XAI

    • Explanations as Decision Support: A core argument is that “explanations should be designed and evaluated with a specific end in mind.” Their value is measured by the “expected improvement in performance on the associated task.”
    • Formalizing Use Cases as Decision Problems: This framework suggests representing tasks as “decision problems,” characterized by actions under uncertainty about the state of the world, with a utility function scoring action-state pairs. This forces specificity in claims about explanation effects (a formal gloss follows this list).
    • Value of Information: Explanations are valuable if they convey information about the true state to the agent, either directly (e.g., providing posterior probability) or indirectly (helping the human better integrate existing information into their decision).
    • Three Definitions of Explanation Value:
    1. Theoretic Value of Explanation (∆E): The maximum possible performance improvement an idealized, rational agent could gain from accessing all instance-level features (over no information). This acts as a sanity check: if this value is low, the explanation is unlikely to help boundedly rational humans much.
    2. Potential Human-Complementary Value of Explanation (∆Ecompl): The potential improvement the rational agent could gain from features beyond what’s already contained in human judgments.
    3. Behavioral Value of Explanation (∆Ebehavioral): The actual observed improvement in human decision performance when given access to the explanation, compared to not having it (measured via randomized controlled experiments).
    • Critique of Idealized Agent Assumption: While explanations offer no additional value to an idealized Bayesian rational agent (as they are a “garbling” of existing information), they are crucial for imperfect human agents who face cognitive costs or may be misinformed or misoptimizing.
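    As a formal gloss on the list above (my notation, not the sources’ exact formalism): a decision problem has actions a ∈ A, unknown states θ, and a utility u(a, θ). An agent with information V picks the action maximizing expected utility, and an explanation E is worth the performance it adds on top of V:

        a^*(V) = \arg\max_{a \in A} \; \mathbb{E}\left[\, u(a, \theta) \mid V \,\right]

        \Delta_{\text{behavioral}} \approx \mathbb{E}\left[\, u\big(a^*(V, E), \theta\big) \,\right] - \mathbb{E}\left[\, u\big(a^*(V), \theta\big) \,\right]

    This is why a randomized experiment (explanation vs. no explanation) is the natural estimator for the behavioral value.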

    C. The Critical Role of Human Factors and Human-Centered AI

    • Bridging Algorithmic Complexity and Human Understanding: HFE is essential to “bridge algorithmic complexity with actionable understanding” by ensuring AI systems align with human cognitive abilities and behavioral patterns.
    • Addressing Unintentional Negative Effects (EPs): HFE provides strategies to anticipate and mitigate EPs, such as designing for “user reflection (as opposed to acceptance)” by promoting “mindful and deliberative (system 2) thinking.”
    • Case Study (Numerical Explanations): A study revealed that both AI experts and non-experts exhibited “unwarranted faith in numbers” (numerical Q values for robot actions), perceiving them as signaling intelligence and potential actionability, even when their meaning was unclear. This demonstrates an EP where well-intentioned numerical transparency led to misplaced trust.
    • Seamful Design: A proposed HFE design philosophy that “strategically reveal relevant information that augments system understanding and conceal information that distracts.” This promotes reflective thinking by introducing “useful cognitive friction,” for example, through interactive counterfactual explanations (“what-if” scenarios).
    • Iterative Design and Stakeholder Engagement: Addressing EPs requires an “iterative approach that allows insights from evaluation to feedback to design,” involving “users as active partners” through participatory design methods.
    • Reframing AI Adoption: HFE advocates for a mindset shift from uncritical “acceptance-driven AI adoption” to “critical reflection,” ensuring AI is “worthy of our trust” and that users are aware of its capabilities and limitations. This resists the “move fast and break things” mentality.
    • Human-AI Relationship in Decision-Making: For high-stakes decisions, AI systems should be seen as “empowerment tools” where the human decision-maker retains responsibility and needs to “justify their decision to others.” XAI is key to making the AI’s role clear and building trust.
    • “Justification” vs. “Explanation”: Some differentiate explanation (understanding AI’s intrinsic processes) from justification (extrinsic information to support AI’s results, e.g., patient history, contrastive examples). Both are crucial for human decision-makers.
    • Mental Models: Effective human-AI collaboration relies on humans developing appropriate mental models of the AI system’s capabilities and limitations. XAI should facilitate this “human-AI onboarding process.”

    D. Practical Challenges in XAI Adoption and Solutions

    • Catalog of Challenges (from Stack Overflow analysis):
    1. Model Integration Issues (31.07% prevalence): Difficulty embedding XAI techniques into ML pipelines, especially with complex models.
    2. Visualization and Plotting Issues (30.01% prevalence): Problems with clarity, interpretability, and consistency of visual XAI outputs.
    3. Compatibility Issues (20.36% prevalence): XAI techniques failing across different ML frameworks or hardware due to mismatches.
    4. Installation and Package Dependency Issues (8.14% prevalence): Difficulties in setting up XAI tools due to conflicts or poor documentation.
    5. Performance and Resource Issues (6.78% prevalence): High computational costs and memory consumption.
    6. Disagreement Issues (2.11% prevalence, but most severe): Conflicting explanations from different XAI methods.
    7. Data Transformation/Integration Issues (1.50% prevalence): Challenges in formatting or merging data for XAI models.
    • Perceived Severity vs. Prevalence: While Model Integration and Visualization/Plotting are most prevalent as technical hurdles, Disagreement Issues are perceived as the most severe by practitioners (36.54% rank highest), as they undermine trust and effective decision-making once tools are implemented.
    • Recommendations for Improvement: Practitioners prioritize:
    • Better Documentation and Tutorials (55.77% strongly agree): Clear, structured guides.
    • Clearer Guidance on Best Practices (48.07% strongly agree): Standardized methodologies.
    • Simplified Configuration and Setup (40.38% strongly agree): Easier onboarding.
    • User-Friendly Interfaces and Improved Visualization Tools: More intuitive and interactive tools.
    • Enhanced Integration with Popular ML Frameworks and Performance Optimization.
    • Addressing Disagreement and Consistency: Acknowledge disagreements and guide users in selecting reliable explanations.

    IV. Gaps and Future Directions

    • Lack of Standardization: XAI still lacks standardized definitions, metrics, and evaluation frameworks, hindering consistent assessment and comparison of methods.
    • Limited Empirical Validation: More situated and empirically diverse human-centered research is needed to understand stakeholder needs, how different user characteristics (e.g., expertise, background) impact susceptibility to EPs, and how explanations are appropriated in unexpected ways.
    • Beyond “Accuracy”: Future research should go beyond basic performance metrics to holistically evaluate human-AI relationships, including reliance calibration, trust, and understandability.
    • Taxonomy of EPs: Developing a taxonomy of explainability pitfalls to better diagnose and mitigate their negative effects.
    • Longitudinal Studies: Needed to understand the impact of time and repeated interaction on human-AI decision-making and trust dynamics.
    • Interdisciplinary Collaboration: Continued and enhanced collaboration among HFE, cognitive science, and AI engineering is crucial to develop frameworks that align AI decision-making with human cognitive and operational capabilities, and to address ethical and accountability challenges comprehensively.
    • Benchmarking for Responsible AI: Creation of benchmarks for various responsible AI requirements (ethics, privacy, security, explainability) to quantify their fulfillment.
    • “Human-in-the-loop”: Further development of this concept within responsible AI, emphasizing the human’s role in checking and improving systems throughout the lifecycle.
    • Trade-offs: Acknowledge and manage inherent trade-offs between different responsible AI aspects (e.g., robustness vs. explainability, privacy vs. accuracy).

    V. Conclusion

    The transition of AI from low-stakes to high-stakes domains necessitates a robust and human-centric approach to explainability. Current XAI research must evolve beyond purely technical considerations to embrace principles from Decision Theory and Human Factors Engineering. The development of frameworks like CIU and the rigorous evaluation of explanations as “means to an end” for specific decision tasks are critical steps. Addressing practical challenges identified by practitioners, especially the pervasive “disagreement problem” and the occurrence of “explainability pitfalls,” is paramount. Ultimately, achieving Responsible AI requires a dynamic, interdisciplinary effort that prioritizes human understanding, trust, and ethical considerations throughout the entire AI lifecycle, ensuring AI serves as an effective and accountable partner in human decision-making.

  • AI: Explainable Enough

    They look really juicy, she said. I was sitting in a small room with a faint chemical smell, doing one of my first customer interviews. There is a sweet spot between going too deep and merely asserting a position. Good AI has to be just explainable enough to satisfy the user without overwhelming them with information. Luckily, I wasn’t new to the problem.

    Nuthatcher atop Persimmons (ca. 1910) by Ohara Koson. Original from The Clark Art Institute. Digitally enhanced by rawpixel.

    Coming from a microscopy and bio background with a strong inclination towards image analysis, I had picked up deep learning as a way to be lazy in the lab. Why bother figuring out features of interest when you can have a computer do it for you, was my angle. The issue was that in 2015 no biologist would accept any kind of deep learning analysis, and definitely not if you couldn’t explain the details.

    What the domain expert user doesn’t want:
    – How a convolutional neural network works. Confidence scores, loss, and AUC are all meaningless to a biologist, and also to a doctor.

    What the domain expert desires:
    – Help at the lowest level of detail that they care about.
    – An AI that identifies features A, B, and C, and says that when you see A, B, & C together, it is likely to be disease X.

    Most users don’t care how deep learning really works. So, if you start giving them details like the IoU score of the object detection bounding box, or whether it was YOLO or R-CNN that you used, their eyes will glaze over and you will never get a customer. Draw a bounding box, heat map, or outline, with the predicted label, and stop there. It’s also bad to go to the other extreme. If the AI just states the diagnosis for the whole image, then the AI might be right, but the user does not get to participate in the process. Not to mention regulatory risk goes way up.
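    As a sketch of what “stop there” means in code, here is the entire presentation layer with Pillow. The file names and the detection tuple are placeholders for whatever your model emits; everything else (IoU, confidence, architecture) deliberately never reaches the screen.

        from PIL import Image, ImageDraw

        # Hypothetical model output: a box and a human-readable label.
        box, label = (120, 80, 340, 260), "mitotic figure"

        img = Image.open("slide_tile.png").convert("RGB")    # placeholder path
        draw = ImageDraw.Draw(img)
        draw.rectangle(box, outline="red", width=3)          # the outline
        draw.text((box[0], box[1] - 14), label, fill="red")  # the label; stop there
        img.save("slide_tile_annotated.png")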

    This applies beyond images; consider LLMs. No one with any expertise likes a black box. Today, why do LLMs generate code instead of directly doing the thing that the programmer is asking them to do? It’s because the programmer wants to ensure that the code “works”, and they have the expertise to figure out if and when it goes wrong. It’s the same reason that vibe coding is great for prototyping but not for production, and why frequent readers can spot AI patterns, ahem, easily. So in a Betty Crocker cake mix kind of way, let the user add the egg.

    Building explainable-enough AI takes immense effort. It actually is easier to train AI to diagnose the whole image or to spill every detail. Generating high-quality data at that just-right level is very difficult and expensive. However, do it right and the effort pays off. The outcome is an AI-human causal prediction machine, where the causes, i.e. the mid-level features, inform the user and build confidence towards the final outcome. The deep learning part is still a black box, but the user doesn’t mind because you aid their thinking.

    I’m excited by some new developments like REX, which sort of retrofit causality onto usual deep learning models. With improvements in performance, user preferences for detail may change, but I suspect the need for AI to be explainable enough will remain. Perhaps we will even have custom labels like ‘juicy’.

  • My Road to Bayesian Stats

    By 2015, I had heard of Bayesian stats but didn’t bother to go deeper into it. After all, significance stars and p-values worked fine. I started to explore Bayesian statistics when considering small sample sizes in biological experiments. How much can you say when you are comparing means of 6 or even 60 observations? This is the nature of work at the edge of knowledge. Not knowing what to expect is normal. Multiple possible routes to an observed result are normal. Not knowing how to pick among those routes is also normal. Yet our statistics fails to capture this reality and the associated uncertainties. There must be a way, I thought.

    Free Curve to the Point: Accompanying Sound of Geometric Curves (1925) print in high resolution by Wassily Kandinsky. Original from The MET Museum. Digitally enhanced by rawpixel.

    I started by searching for ways to overcome small sample sizes. There are minimum sample sizes recommended for t-tests. Thirty is an often quoted number with qualifiers. Bayesian stats does not have a minimum sample size. This had me intrigued. Surely, this can’t be a thing. But it is. Bayesian stats creates a mathematical model using your observations and then samples from that model to make comparisons. If you have any exposure to AI, you can think of this a bit like training an AI model. Of course the more data you have the better the model can be. But even with a little data we can make progress. 
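    Here is a minimal sketch of that “build a model, then sample from it” workflow in PyMC, with made-up numbers: six observations per group, the kind of n that makes a t-test nervous. The priors are my illustrative guesses, not a recipe.

        import numpy as np
        import pymc as pm
        import arviz as az

        control = np.array([4.1, 5.0, 4.6, 3.9, 4.4, 4.8])    # made-up data
        treatment = np.array([5.2, 6.1, 4.9, 5.8, 6.4, 5.5])

        with pm.Model():
            # Weakly informative priors over plausible means and spread.
            mu_c = pm.Normal("mu_c", mu=5, sigma=5)
            mu_t = pm.Normal("mu_t", mu=5, sigma=5)
            sigma = pm.HalfNormal("sigma", sigma=5)

            pm.Normal("obs_c", mu=mu_c, sigma=sigma, observed=control)
            pm.Normal("obs_t", mu=mu_t, sigma=sigma, observed=treatment)

            # The quantity we actually care about: the group difference.
            diff = pm.Deterministic("diff", mu_t - mu_c)
            idata = pm.sample(2000, tune=1000, random_seed=1)

        # A full posterior for the difference, not a single star-studded verdict.
        print(az.summary(idata, var_names=["diff"]))

    The summary reports a credible interval for the difference, which is exactly how “we are only x% sure” gets said out loud.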

    How do you say: there is something happening and it’s interesting, but we are only x% sure? Frequentist stats have no way through. All I knew was to apply the t-test, and if there are “***” in the plot, I’m golden. That isn’t accurate though. Low p-values indicate the strength of evidence against the null hypothesis. Let’s take a minute to unpack that. The null hypothesis is that nothing is happening. If you have a control set and do a treatment on the other set, the null hypothesis says that there is no difference. So, a low p-value says that data like ours would be unlikely if the null hypothesis were true. But that does not imply that the alternative hypothesis is true. What’s worse is that there is no way for us to say that the control and experiment have no difference. We can’t accept the null hypothesis using p-values either.

    Guess what? Bayesian stats can do all those things. It can measure differences, accept and reject both null and alternative hypotheses, and even communicate how uncertain we are (more on this later). All without making assumptions about our data.

    It’s often overlooked, but frequentist analysis also requires the data to have certain properties, like normality and equal variance. Biological processes have complex behavior and, unless observed, assuming normality and equal variance is perilous. The danger only goes up with small sample sizes. Again, Bayes requires no such assumptions about your data. Whatever shape the distribution is, so-called outliers and all, it all goes into the model. Small sample sets do produce weaker fits, but this is kept transparent.

    Transparency is one of the key strengths of Bayesian stats. It requires you to work a little bit harder on two fronts though. First, you have to think about your data generating process (DGP). This means thinking about how the data points you observe came to be. As we said, the process is often unknown. We have at best some guesses of how this could happen. Thankfully, we have a nice way to represent these guesses. DAGs, directed acyclic graphs, are a fancy name for a simple diagram showing what affects what. Most of the time we are trying to discover the DAG, i.e. the pathway of a biological outcome. Even if you don’t do Bayesian stats, using DAGs to lay out your thoughts is a great habit. In Bayesian stats the DAGs can be used to test whether your model fits the data we observe. If the DAG captures the data generating process, the fit is good; if it doesn’t, the fit is poor.

    The other hard bit is doing the analysis and communicating the results. Bayesian stats forces you to be verbose about the assumptions in your model. This part is almost magicked away in t-tests. Frequentist stats also makes assumptions about the model your data is supposed to follow. It all happens so quickly that there isn’t even a second to think about it. You put in your data, click t-test and woosh! You see stars. In Bayesian stats, stating the assumptions you make in your model (using DAGs and hypotheses about DGPs) communicates to the world what you think this phenomenon is and why you think it occurs.

    Discovering causality is the whole reason for doing science. Knowing the causality allows us to intervene in the form of treatments and drugs. But if my tools don’t allow me to be transparent, and worse, if they block people from correcting me, why bother?

    Richard McElreath says it best:

    There is no method for making causal models other than science. There is no method to science other than honest anarchy.

  • The secret flag of content

    I don’t have any fun when I use LLMs to write. It may have perceived utility: popping out a LinkedIn article or two every day. But I bet no one is actually reading. It’s a strip mall for a thumb stroll.

    LLMs suck at writing. The summaries that LLMs give with “Deep Research” are so poor in quality that I start to skim them. Yes, I skim the thing that is already a summary.

    Let’s do a small experiment using Harry G. Frankfurt’s essay On Bullshit.

    Here are HGF’s opening lines: 

    “One of the most salient features of our culture is that there is so much bullshit. Everyone knows this. Each of us contributes his share.”

    Now here is ChatGPT o3, with the advantage of having HGF in the training data and extra contextualizing from me (“theory of bullshit”):

    Prompt: write me an introductory paragraph on the theory of bullshit 

     Result: “Bullshit isn’t just lying gone sloppy; it’s a distinct communicative strategy built on indifference to truth. Where the liar knows the facts and works to conceal or invert them, the bullshitter treats truth-value as irrelevant raw material—selecting, mixing, or inventing claims to achieve an effect (status, alignment, vibe) rather than to inform. The result is epistemic pollution: signals that…”

    Look! It’s the secret flag of “content”:  “Blah isn’t just X; Blah is Y”.

    AI generated flag. As dead as the text on it is wrong.

    I cannot assimilate this, and it definitely can’t help me synthesize anything. There is always better human writing, even if it is poorly written. It has a certain aliveness, and that aliveness is contagious enthusiasm. Vibe coding I can understand: you can see bad output in the final result. Vibe coding still manages to change something in my mind: knowing what I don’t want.

    I don’t mind people using AI at all; I use it alllll the time. Writing with LLMs is just no fun. All this prompting, and almost nothing changes in my mind. When an AI rearranges your thoughts, it does not rearrange your brain.
