Author: Aneesh Sathe

  • Jan 14, 2025

    Photo: by me, Aneesh Sathe, Malaysia, 2011

    Font fight!

The fonts have assembled, their ligatures sharp, their curves shiny. They all line up with perfect kerning… or do they? Your eyes and mind are their battlefield.

    https://www.codingfont.com/

I got JetBrains Mono btw.


    Decentralization is just partial centralization

    Renée DiResta writes about the social media flux in Noema.

    Decentralization places a heavy burden on individual instance administrators, mostly volunteers, who may lack the tools, time or capacity to address complex problems effectively.

    Identity verification is another weak point, leading to impersonation risks that centralized platforms typically manage more effectively. Inconsistent security practices between servers can allow malicious actors to exploit weaker links.

While all this is fine, I have a completely different view of the goings-on around social media. I’d rather quit something entirely than go through the pain of sorting through and tuning the place just right.

The internet has infinite space. Make your own blog, follow people you like (RSS feeds still work!) and ignore those you don’t. Nostalgic about back when Twitter was good? Well, there was a time when the internet was good. It was good because the people with access created little gardens of their own (not just of the digital garden variety, but those too). Psst… it’s still good btw, social media just blinds you to it.

While I’m a staunch early adopter, I’m also an early abandoner. The only thing I’ve been unable to abandon is blogs. I’ve never felt it was better to shut myself inside a walled garden, but I would suffocate if I weren’t able to surf, what a wonderful word that is, the internet.


    Obsessing is happiness

My happiest times have been when I was completely consumed by some task or project for days on end. I’ve learned hydroponics and grown an ungodly amount of mint in the Singapore sun. I’ve made terrible mead, taught myself programming, then machine learning… countersteering a motorcycle? You should watch me lean.

All this to say that happiness is an entirely oblique pursuit. This was crystallized for me in this post about Bertrand Russell’s The Conquest of Happiness by the wholly awesome Maria Popova.


    The 1900s are here

    Every 24th frame of Stanley Kubrick’s masterpiece 2001: A Space Odyssey posted once an hour

That cool projects like these exist is a testament to the eternal cool of the Internet. As I write, the bot is on its 1,899th hour. The exciting 1900s are coming up.


If you are new to Bayesian Stats, start here


  • Jan 13, 2025 Life, platforms, vectors

    Crystal-Bison

    I don’t want to give away much of the poem but it captures the nature, ferocity, and purpose of life.


    Ghost Exits

Aligning with the idea that blogs will be the last of the good internet, there is a broader question about platforms and their methods. Long before Meta and X abandoned all pretense, the internet was already under attack while we believed this was fine.

    We need more (and better) institutions and fewer platforms, and the latter have flourished at the expense of the former, advancing a specific agenda under an apolitical guise…

    We depend on those platforms more because our institutions have weakened. The present arrangement was far from inevitable and need not be permanent.

    …[Platforms have an] inherent tendency to extract value from their users and seek growth, while presenting the whole arrangement as a utopia.


COVID 5 years later

Speaking of this being all fine, the WHO held a four-day conference about COVID. There is more research than people can read, but

    Despite the flood of insights into the behavior of the virus and how to prevent it from causing harm, many at the meeting worried the world has turned a blind eye to the lessons learned from the pandemic.

It’s one of those black holes in history; we seem unable to peer beyond the event horizon.

    Virologist Jesse Bloom of the Fred Hutchinson Cancer Center, who is not convinced the pandemic began at the market and has urged colleagues to remain open to the possibility of a lab leak […] “There’s still little actual information about the first human cases,” Bloom says. “There’s just not a lot of knowledge about what was really going on in Wuhan in late 2019.”

Against the backdrop of the world pretending everything is going back to normal, one group, virologists, remains under attack.

    the world is dropping its guard against novel pathogens. Infectious disease is “not a safe space to really be working in,” she told Science. “Labs have been threatened. People have been threatened. Governments don’t necessarily want to be the ones to say, ‘Hey, we found something new.’”


    To my tiny set of newsletter subscribers: HI! 👋

  • The Universal Library in the River of Noise


    Few ideas capture the collective human imagination more powerfully than the notion of a “universal library”—a singular repository of all recorded knowledge. From the grandeur of the Library of Alexandria to modern digital initiatives, this concept has persisted as both a philosophical ideal and a practical challenge. Miroslav Kruk’s 1999 paper, “The Internet and the Revival of the Myth of the Universal Library,” revitalizes this conversation by highlighting the historical roots of the universal library myth and cautioning against uncritical technological utopianism. Today, as Wikipedia and Large Language Models (LLMs) like ChatGPT emerge as potential heirs to this legacy, Kruk’s insights—and broader reflections on language, noise, and the very nature of truth—resonate more than ever.


    The myth of the universal library

    Humanity has longed for a comprehensive archive that gathers all available knowledge under one metaphorical roof. The Library of Alexandria, purportedly holding every important work of its era, remains our most enduring symbol of this ambition. Later projects—such as Conrad Gessner’s Bibliotheca Universalis (an early effort to compile all known books) and the Enlightenment’s encyclopedic endeavors—renewed the quest for total knowledge. Francis Bacon famously proposed an exhaustive reorganization of the sciences in his Instauratio Magna, once again reflecting the aspiration to pin down the full breadth of human understanding.

    Kruk’s Historical Lens  

    This aspiration is neither new nor purely technological. Kruk traces the “myth” of the universal library from antiquity through the Renaissance, revealing how each generation has grappled with fundamental dilemmas of scale, completeness, and translation. According to Kruk,

    inclusivity can lead to oceans of meaninglessness

The library on the “rock of certainty”… or an ocean of doubt?

    Alongside the aspiration toward universality has come an ever-present tension around truth, language, and the fragility of human understanding. Scholars dreamed of building the library on a “rock of certainty,” systematically collecting and classifying knowledge to vanquish doubt itself. Instead, many found themselves mired in “despair” and questioning whether the notion of objective reality was even attainable. As Kruk’s paper points out,

The aim was to build the library on the rock of certainty: We finished with doubting everything … indeed, the existence of objective reality itself.

    Libraries used to be zero-sum

    Historically,

    for some libraries to become universal, other libraries have to become ‘less universal.’

    Access to rare books or manuscripts was zero-sum; a collection in one part of the world meant fewer resources or duplicates available elsewhere. Digitization theoretically solves this by duplicating resources infinitely, but questions remain about archiving, licensing, and global inequalities in technological infrastructure.


Interestingly, Google was founded just as Kruk’s 1999 paper was nearing publication. In many ways, Google’s search engine became a “library of the web,” indexing and ranking content to make it discoverable on a scale previously unimaginable. Yet it is also a reminder of how quickly technology can outpace our theoretical frameworks: Perhaps Kruk couldn’t have known about Google without Google. Something something future is already here…

    Wikipedia: an oasis island

    Wikipedia stands as a leading illustration of a “universal library” reimagined for the digital age. Its open, collaborative platform allows virtually anyone to contribute or edit articles. Where ancient and early modern efforts concentrated on physical manuscripts or printed compilations, Wikipedia harnesses collective intelligence in real time. As a result, it is perpetually expanding, updating, and revising its content.

    Yet Kruk’s caution holds: while openness fosters a broad and inclusive knowledge base, it also carries the risk of “oceans of meaninglessness” if editorial controls and quality standards slip. Wikipedia does attempt to mitigate these dangers through guidelines, citation requirements, and editorial consensus. However, systemic biases, gaps in coverage, and editorial conflicts remain persistent challenges—aligning with Kruk’s observation that inclusivity and expertise are sometimes at odds.

    LLMs – AI slops towards the perfect library

Where Wikipedia aspires to accumulate and organize encyclopedic articles, LLMs like ChatGPT offer a more dynamic, personalized form of “knowledge” generation. These models process massive datasets—including vast portions of the public web—to generate responses that synthesize information from multiple sources in seconds. In a way, this almost solves one of the sister aims of the perfect library, the perfect language, with embeddings serving as a stand-in for perfect words.

    The perfect language, on the other hand, would mirror reality perfectly. There would be one exact word for an object or phenomenon. No contradictions, redundancy or ambivalence.


    The dream of a perfect language has largely been abandoned. As Umberto Eco suggested, however, the work on artificial intelligence may represent “its revival under a different name.” 

    The very nature of LLMs highlights another of Kruk’s cautions: technological utopianism can obscure real epistemological and ethical concerns. LLMs do not “understand” the facts they present; they infer patterns from text. As a result, they may produce plausible-sounding but factually incorrect or biased information. The quantity-versus-quality dilemma thus persists.
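To make the embedding-as-perfect-word idea concrete, here is a toy sketch. The vectors below are made up for illustration; real models learn hundreds of dimensions from text. The point is the same either way: meaning lives in the geometry, where closeness stands in for shared sense, rather than in one exact word.

```python
import numpy as np

# Toy 3-d "embeddings" -- invented values, purely illustrative.
vectors = {
    "king":  np.array([0.9, 0.80, 0.1]),
    "queen": np.array([0.9, 0.75, 0.8]),
    "apple": np.array([0.1, 0.20, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Near-synonyms end up close together; unrelated words don't.
assert cosine(vectors["king"], vectors["queen"]) > cosine(vectors["king"], vectors["apple"])
```

Unlike the dreamed-of perfect language, nothing here is exact: every word is a smear in the space, which is both the power and the problem.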

    Noise is good actually?

    Although the internet overflows with false information and uninformed opinions, this noise can be generative—spurring conversation, debate, and the unexpected discovery of new ideas. In effect, we might envision small islands of well-curated information in a sea of noise. Far from dismissing the chaos out of hand, there is merit in seeing how creative breakthroughs can emerge from chaos. Gold of Chemistry from leaden alchemy.

Concerns persist: misinformation, bias, and AI slop invite us to exercise editorial diligence to sift through the noise productively. It also echoes Kruk’s notion of the universal library as something that “by definition, would contain materials blatantly untrue, false or distorted,” thus forcing us to navigate “small islands of meaning surrounded by vast oceans of meaninglessness.”

    Designing better knowledge systems

    Looking forward, the goal is not simply to build bigger data repositories or more sophisticated AI models, but to integrate the best of human expertise, ethical oversight, and continuous quality checks. Possible directions include:

    1. Strengthening Editorial and Algorithmic Oversight:

    • Wikipedia can refine its editorial mechanisms, while AI developers can embed robust validation processes to catch misinformation and bias in LLM outputs.

    2. Contextual Curation:  

    • Knowledge graphs are likely great bridges between curated knowledge and generated text

    3. Collaborative Ecosystems:  

    • Combining human editorial teams with AI-driven tools may offer a synergy that neither purely crowdsourced nor purely algorithmic models can achieve alone. Perhaps this process could be more efficient by adding a knowledge base driven simulation (see last week’s links) of the editors’ intents and purposes.
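A hypothetical sketch of the bridge in point 2: a miniature knowledge graph of (subject, relation, object) triples that generated text is checked against before it is trusted. The triples and the helper function are illustrative inventions, not any particular KG library.

```python
# Hypothetical miniature knowledge graph as a set of triples.
triples = {
    ("Library of Alexandria", "located_in", "Egypt"),
    ("Bibliotheca Universalis", "compiled_by", "Conrad Gessner"),
}

def grounded(subject: str, relation: str, obj: str) -> bool:
    """Only let a generated claim through if the graph contains it."""
    return (subject, relation, obj) in triples

# A generated claim gets checked instead of trusted:
assert grounded("Bibliotheca Universalis", "compiled_by", "Conrad Gessner")
assert not grounded("Bibliotheca Universalis", "compiled_by", "Francis Bacon")
```

Real systems do this with entity linking and graph queries rather than exact tuple matches, but the editorial move is the same: curated structure vetoes fluent noise.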

A return to the “raw” internet, as opposed to the social-media-cooked version, might be the trick. Armed with new tools we can (and should) create meaning. In the process, Leibniz might get his universal digital object identifier after all.

    Compression progress as a fundamental force of knowledge

    Ultimately, Kruk’s reminder that the universal library is a myth—an ideal rather than a finished product—should guide our approach. Its pursuit is not a one-time project with a definitive endpoint; it is an ongoing dialogue across centuries, technologies, and cultures. As we grapple with the informational abundance of the digital era, we can draw on lessons from Alexandria, the Renaissance, and the nascent Internet of the 1990s to inform how we build, critique, and refine today’s knowledge systems.

    Refine so that tomorrow, maybe literally, we can run reclamation projects in the noisy sea.


    Image: Boekhandelaar in het Midden-Oosten (1950 – 2000) by anonymous. Original public domain image from The Rijksmuseum

  • Jan 11, 2025 – Leading with Kindness


    Leading with Kindness

    PDF kindly made available by the author, Steve Swensen – via Helen Bevan on Bluesky.

Steve Swensen held leadership positions at Mayo Clinic, working not just to improve care but also to prevent burnout. This paper from May 2024 provides leaders with a framework to help colleagues do better, and “Kindness is helping people do better.”

A colleague or work environment that creates stress and anxiety has a very real impact on your health and long-term well-being. An organization can improve team health by creating space for “nurturing human conditions” to emerge:

• Agency is the capacity of individuals or teams to act independently.
• Collective effervescence is the sense of meaning, community spirit, energy, invigoration and harmony people feel when they come together in groups with a shared purpose.
• Camaraderie is a multidimensional combination of social connectedness, teamwork, respect, authenticity, appreciation, loyalty and recognition of each other’s mattering. It is about belonging.
• Positivity is choosing a disposition of optimism and positive affect with a mindset that sees opportunities for learning, abundance and possibility in the world.

Being primarily a systems paper, it provides 10 systems to lower stress and increase resilience:

    Steve deep dives into each of these. Below are some practices that I have experienced or conditions I’ve strived to create:

    • Promoting Agency by asking the team how to improve, prioritizing where to focus and empowering the team to execute on the opportunities.
    • Ikigai and lifecrafting: creating space and giving opportunities for people to work on personally meaningful work.
    • Having lunch or coffee as a team 🙂
    • Pushing the decisions of work-life balance down to the people that actually have to live with the choices. Colleagues are adults and providing them with agency in these matters creates psychological safety.

    Five kindness behaviours: Leader behaviours that reduce emotional exhaustion and engender satisfaction.

    • Seek to understand
      • Solicit input from colleagues with humility
    • Appreciate
      • Recognise associates with authentic gratitude
    • Mentor
      • Nurture and support coworker aspirations
    • Foster belonging
      • Welcome everyone with respect and acceptance
    • Be transparent
      • Communicate openly for collective decisions

    Other references to Steve’s work:

    [Video] The Mayo Clinic model of care.

Framework to Reduce Professional Burnout – [PDF] via LinkedIn.


    Image: The Harbinger of Autumn (1922) by Paul Klee.


  • Jan 10, 2025 – AI Agents, Machiavelli’s Study

    Agents Are Not Enough

Last year I was experimenting heavily with Knowledge Graphs, because it’s been clear that LLMs by themselves fall short for lack of knowledge. This paper by Chirag Shah and Ryen White (you can click the heading above) from Dec 2024 expands on those shortcomings by exploring not just knowledge but also value generation, personalization, and trust.

They open the paper by casting a very wide definition of an “agent”: everything from thermostats to LLM tools. While this seems facetious at first, their next point is interesting. Agents by definition “remove agency from a user in order to do things on the user’s behalf and save them time and effort.” I think this is an interesting way to inject an LLM-flavored principal-agent problem into the Agentic AI conversation.

    Their broad suggestion is to expand the ecosystem of agents by including “Sims”. Sims are simulations of the user which address

    • privacy and security
    • automated interactions and
    • representing the interests of the user by holding intimate knowledge about the user

    It’s a short easy read, if you have 10 min.


    Machiavelli and the Emergence of the Private Study

Infinite knowledge is available through the internet today. It is available trivially, and some, ahem, blogs make a performance of consuming it. Machiavelli used to

    put on the garments of court and palace. Fitted out appropriately, I step inside the venerable courts of the ancients, where, solicitously received by them, I nourish myself on that food that alone is mine and for which I was born, where I am unashamed to converse with them and to question them about the motives for their actions, and they, in their humanity, answer me. And for four hours at a time I feel no boredom, I forget all my troubles, I do not dread poverty, and I am not terrified by death. I absorb myself into them completely.

    Some folks have a private office, but an office is not a study. A study or, studiolo

    in Italian, a precursor to the modern-day study — came to offer readers access to a different kind of chamber, a personal hideaway in which to converse with the dead. Cocooned within four walls, the studiolo was an aperture through which one could cultivate the self. After all, to know the world, one must begin with knowing the self, as ancient philosophy instructs. In order to know the self, one ought to study other selves too, preferably their ideas as recorded in texts. And since interior spaces shape the inward soul, the studiolo became a sanctuary and a microcosm. The study thus mediates the world, the word, and the self.

    In the 1500s Michel de Montaigne writes:

    We should have wife, children, goods, and above all health, if we can; but we must not bind ourselves to them so strongly that our happiness [tout de heur] depends on them. We must reserve a back room [une arriereboutique] all our own, entirely free, in which to establish our real liberty and our principal retreat and solitude.

    A little later, Virginia Woolf points out what seems to be an eternal inequality by struggling to find “a room of one’s own”.

    The enclosure of the study, for those of us lucky to have one, offers us a paradoxical sort of freedom. Conceptually, the studiolo is a pharmakon, a cure or poison for the soul. In its highest aspirations, the studiolo, as developed by humanists from Petrarch to Machiavelli to Montaigne, is a sanctuary for self-cultivation. Bookishness was elevated into a saintly virtue

    The world today would perhaps be better off if more of us had our own studiolos.


    Image source

  • Jan 9, 2025 – The Age of Fire and Gravel

    Today I discovered the Public Domain Image Archive attached to the Public Domain Review which has some great essays and commentary on image collections, like the one below.


    Utagawa Hiroshige: Last Great Master of Ukiyo-e

    Just before Hiroshige died, possibly of cholera, he wrote the following poem:

    I leave my brush in the East
    And set forth on my journey.
    I shall see the famous places in the Western Land.

This was a foretelling of sorts, because Hiroshige

    was a hugely influential figure, not only in his homeland but also on Western painting. Towards the end of the nineteenth century, as a part of the trend in “Japonism”, European artists such as Monet, Whistler, and Cézanne, looked to Hiroshige’s work for inspiration, and a certain Vincent van Gogh was known to paint copies of his prints.


    Los Angeles Burns

Driven by global climate change, the Santa Ana “Devil” winds were intense this year, parching the land.

Part of the reason for the spread was the rerouting of funds away from the fire department, which reduced the ability to respond appropriately.

    Besides the immense personal damage, NASA’s JPL was shut down.

Though not all happy memories, LA was home once. It’s where I discovered my love of bio and computers, not to mention the tremendous library system and the Tar Pits.


    Rethinking Dose-Response Analysis with Bayesian Inference

Analyzing dose-response experiments is tricky, and the standard Marquardt-Levenberg algorithm “does not evaluate the certainty (or uncertainty) of the estimates nor does it allow for the statistical comparison of two datasets.” This can lead to biased conclusions, as a lot of subjectivity and wishful thinking has scope to creep in.
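For reference, the standard fit the quote criticizes looks something like this sketch: SciPy’s `curve_fit` (Levenberg-Marquardt under the hood when unbounded) on a four-parameter logistic, with hypothetical dose-response data. It returns one point estimate per parameter and says nothing about how probable other values would be.

```python
import numpy as np
from scipy.optimize import curve_fit

# Four-parameter logistic (4PL), the usual dose-response model
# (increasing form: response rises with dose).
def four_pl(x, bottom, top, ec50, hill):
    return bottom + (top - bottom) / (1 + (ec50 / x) ** hill)

# Hypothetical data: doses in µM, responses in % of max effect.
doses = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0])
resp  = np.array([3.0, 5.0, 14.0, 38.0, 61.0, 86.0, 95.0])

# Levenberg-Marquardt least squares: a single point estimate per
# parameter, with no distribution over other probable values.
params, cov = curve_fit(four_pl, doses, resp, p0=[0, 100, 0.5, 1])
bottom, top, ec50, hill = params
```

The covariance matrix `cov` gives local standard errors at best; it does not let you compare two datasets probabilistically, which is the gap the paper’s Bayesian approach fills.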

    A Bayesian Approach?

The authors propose a Bayesian inference methodology that addresses these limitations. This approach can “characterize the noise of dataset while inferring probable values distributions for the efficacy metrics.”

    • It also allows for the statistical comparison of two datasets and can “compute the probability that one value is greater than the other”.
    • Critically, it incorporates prior knowledge (and intuition) through prior distributions: “The model incorporates the notion of intuition through prior distributions and computes the most probable value distribution for each of the efficacy metrics”.
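The dataset comparison falls out of posterior draws almost for free. A minimal sketch, using simulated stand-ins for the draws an actual Bayesian fit would produce (the normal distributions and their parameters are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for posterior draws of an efficacy metric (e.g. EC50)
# from two datasets; in practice these come from the Bayesian fit.
ec50_a = rng.normal(loc=0.50, scale=0.08, size=10_000)
ec50_b = rng.normal(loc=0.65, scale=0.10, size=10_000)

# P(EC50_B > EC50_A): just count across paired draws.
p_b_greater = np.mean(ec50_b > ec50_a)
```

No test statistic, no p-value machinery: the probability that one value is greater than the other is literally the fraction of draws where it is.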

    Beyond Single Point Estimates

    This method moves beyond single-point estimates, which can be misleading, and “explicitly quantifies the reliability of the efficacy metrics taking into account the noise over the data”. The goal is to help researchers “analyze and interpret dose–response experiments by including uncertainty in their reasoning and providing them a simple and visual approach to do so”.

    So What?

    By “considering distributions of probable values instead of single point estimates,” this Bayesian approach provides a more robust interpretation of your data. This is particularly important when dealing with noisy or unresponsive datasets, which can often occur in drug discovery.

    Though their demo is offline, the code is available.


    This was on repeat today


    Image source: Book Cover: Ignatius Donnelly. Ragnarok: The Age of Fire and Gravel. New York, D. Appleton and Company, 1883

  • Jan. 8, 2025: Count your DIGITS! Drunk Bayesian

    NVIDIA Project DIGITS

    Around 2015 I was putting together funds in academia. Convincing IT, senior professors, and finance that yes, it was worth giving me a LOT of cash to build a workstation with multiple GPUs.


    “No, it isn’t for gaming.”

    “Yes, it will change the world.”

“No, there are no university rules that hardware bought on multiple invoices across multiple departments can’t be used in the same box.”

    “Yes, I’m aware that all my individual quotes are just below the bureaucracy summoning purchase limits.”

“Yes, I tried random forest along with the other stats and ML methods; this really is better. How do I know? Well…”

    Deep Learning was taking off, but in the biotech world it was seen as a regular tech update.

I did get the money and built that loud, jet-engine-sounding monster. Among the first software I installed was DIGITS (freshly archived); this was just as Keras had its release, and all I wanted to do was build a neural network. That little step changed my life.

Today NVIDIA announced hardware also named DIGITS. With the claimed performance (or even near there), it will be as life-changing for the early explorers as the DIGITS software was for me.

The $3000 price tag is probably much better justified than it was for another little piece of hardware from last year. I hope to get my hands on a couple of these, not just for LLMs but for my first love, images.

    From the press release:

    GB10 Superchip Provides a Petaflop of Power-Efficient AI Performance
    The GB10 Superchip is a system-on-a-chip (SoC) based on the NVIDIA Grace Blackwell architecture and delivers up to 1 petaflop of AI performance at FP4 precision.
    GB10 features an NVIDIA Blackwell GPU with latest-generation CUDA® cores and fifth-generation Tensor Cores, connected via NVLink®-C2C chip-to-chip interconnect to a high-performance NVIDIA Grace™ CPU, which includes 20 power-efficient cores built with the Arm architecture. MediaTek, a market leader in Arm-based SoC designs, collaborated on the design of GB10, contributing to its best-in-class power efficiency, performance and connectivity.
    The GB10 Superchip enables Project DIGITS to deliver powerful performance using only a standard electrical outlet. Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage. With the supercomputer, developers can run up to 200-billion-parameter large language models to supercharge AI innovation. In addition, using NVIDIA ConnectX® networking, two Project DIGITS AI supercomputers can be linked to run up to 405-billion-parameter models.

    Technically it’s not a proper petaflop at FP4 precision, but I’m ok with that kind of impropriety.
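Some back-of-envelope arithmetic on why 128GB of unified memory lines up with the 200-billion-parameter claim at FP4:

```python
# FP4 stores each weight in 4 bits = 0.5 bytes.
params = 200e9
bytes_per_param_fp4 = 0.5
weights_gb = params * bytes_per_param_fp4 / 1e9  # 100.0 GB

# That leaves roughly 28 GB of the 128 GB for KV cache, activations
# and the OS -- tight but plausible, and linking two units via
# ConnectX doubles the budget, hence the 405B figure.
headroom_gb = 128 - weights_gb
```

This is a napkin estimate only: real deployments also spend memory on the KV cache growing with context length, so the practical model size is somewhat smaller.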


    Drunk Bayesians and the standard errors in their interactors

This video is from 2018, and one of the interesting things Gelman touches on is the ability of AI to do model fitting automatically. Gelman argues that an AI designed to perform statistical analysis would also need to deal with the implications of Cantor’s diagonal.

    Essentially, new models need to be built when new data don’t fit the old model. You go down the diagonal of more data vs increasing model complexity.

This means that an AI cannot have a full model ahead of time; it must have different modules and an executive function, and it must make mistakes. He suggests that AI needs to be more like a human, with analytical and visual modules and an executive function, rather than a single, monolithic program.

    Perhaps we aren’t quite there yet but the emerging agentic methods are looking promising in light of his thoughts.

Some cool quotes below that I hope encourage you to watch this longish but lively window into the mind of one of the best-known Bayesian statisticians.

    -> I think that there’s something about statistics in the philosophy of statistics which is inherently untidy

    -> …in Bayes you do inference based on the model and that’s codified and logical but then there’s this other thing we do which is we say our inferences don’t make sense we reject our model and we need to change it

    -> Our statistical models have to be connected to our understanding of the world

    On the reproducibility crisis in science in three parts:

Studies not replicating: This is the most obvious part of the crisis, where a study’s findings are not supported when the study is repeated with new data. This casts doubt on the original claims and the related literature, and can invalidate a whole sub-field.

    Problems with statistics: Gelman argues that some studies do not do what they claim to do, often due to errors in statistical analysis or research methods. He gives the example of a study about male upper body strength that actually measured the fat on men’s arms.

    Three fundamental problems of statistics:

    • generalizing from a sample to a population
    • generalizing from a treatment to a control group, and
    • generalizing from measurements to underlying constructs of interest

    This last point is particularly interesting in the biotech space. Which brings us to,

Problems with substantive theory: Many studies lack a strong connection between statistical models and the real world. A better understanding of mechanisms and interactions is necessary for more accurate inferences. Gelman also discusses the “freshman fallacy,” where researchers assume a random sample of the population is not needed for causal inference, when in fact it is crucial if the treatment effect varies among people (especially important if you are trying to discover drugs!). He further notes that the lack of theory and mechanisms leads to not being able to estimate interactions, which are crucial.

    There are many more topics he covers from p-values and economics to bayesians not being bayesian enough.


As thanks for providing the source of the snowclone title, here’s some Static


    Image: Vintage European style abacus engraving from New York’s Chinatown. A historical presentation of its people and places by Louis J. Beck (1898). Original from the British Library. Digitally enhanced by rawpixel.

  • Jan. 7, 2025: Building Dwelling Thinking

Today’s product builders and data scientists shape the way people see the world. The analysis, plots, and UI we create are places where others dwell. Not merely occupy, but live in and harness the mental space we give them access to. Martin Heidegger wrote Building Dwelling Thinking (archive.org) in his 1971 book, Poetry, Language, Thought.

    Man’s relation to locations, and through locations to spaces, inheres in his dwelling. The relationship between man and space is none other than dwelling, strictly thought and spoken

What Heidegger calls locations, I have thought of as places. Places are anything with order and purpose, as thought of by the user. Spaces are not so much the mathematical or physics concepts but more like domains, such as the AI-space or the biotech-space. These spaces come into being as a result of thought extended from the boundaries of places to the explorations enabled by the affordances of said spaces.

    A boundary is not that at which something stops but, as the Greeks recognized, the boundary is that from which something begins its presencing.
    […]
    The location admits the [space] and it installs the [space].

    Heidegger addresses thinking only very briefly and from a distance. In the life of today’s knowledge worker thinking is everything. For the knowledge worker to be able to “dwell” they must be able to bring together the act of thinking and building. This is why good visualization, analysis that reveals rather than hides, and products that expand rather than limit the user’s ability are important.

Building and thinking are, each in its own way, inescapable for dwelling. The two, however, are also insufficient for dwelling so long as each busies itself with its own affairs in separation instead of listening to one another. They are able to listen if both building and thinking belong to dwelling, if they remain within their limits and realize that the one as much as the other comes from the workshop of long experience and incessant practice.

To be able to free the user is critical. Everyone has their own expertise, and it is usually not in using your product. To make your place so convoluted that the user has to conform and constrict to be able to use it is not kind placemaking. At the beginning of the essay there is a definition of what it means “to free”:

    To free really means to spare. The sparing itself consists not only in the fact that we do not harm the one whom we spare. Real sparing is something positive and takes place when we leave something beforehand in its own nature, when we return it specifically to its being, when we “free” it in the real sense of the word into a preserve of peace. To dwell, to be set at peace, means to remain at peace within the free sphere that safeguards each thing in its nature. The fundamental character of dwelling is this sparing and preserving

    My takeaway is that whenever a product or place is built, its primary concern should be the freedom of the person expected to dwell there. The freedom you provide enables them to explore the spaces they care about.


    Image credit: Nagoya Castle (ca.1932) print in high resolution by Hiroaki Takahashi. Original from The Los Angeles County Museum of Art. Digitally enhanced by rawpixel.

  • Jan. 6, 2025


    VMC: A Grammar for Visualizing Statistical Model Checks

    Data scientists check how well a statistical model fits observed data with numerical and graphical checks. Graphical checks span a huge range beyond well-known ones like Q–Q plots. Scientists are of course limited by their training and experience, and arriving at effective model checks is not trivial. Both programmatic and visual plotting tools require significant effort to generate new plots, increasing the friction of doing proper checks.
    Work out of Jessica Hullman‘s lab has produced the VMC package (github), a tool to easily access these methods and quickly determine the quality of your model. VMC is

    a high-level declarative grammar for generating model check visualizations. VMC categorizes design choices in model check visualizations via four components: sampling specification, data transformation, visual representation(s), and comparative layout. VMC improves the state-of-the-art in graphical model check specification in two ways:
    (1) it allows users to explore a wide range of model checks through relatively small changes to a specification as opposed to more substantial code restructuring, and
    (2) it simplifies the specification of model checks by defining a small number of semantically-meaningful design components tailored to model checking.

    The work comes from a thoughtful place, aiming not just to help out statisticians but to properly address the design considerations of a good tool, extending the wonderful family of tools exemplified by ggplot2, built on the Grammar of Graphics.

    Visualizations play a critical role in validating and improving statistical models. However, the design space of model check visualizations is not well understood, making it difficult for authors to explore and specify effective graphical model checks. VMC defines a model check visualization using four components:
    1. samples of distributions of checkable quantities generated from the model, including predictive distributions for new data and distributions of model parameters;
    2. transformations on observed data to facilitate comparison;
    3. visual representations of distributions;
    4. layouts to facilitate comparing model samples and observed data.
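    VMC itself is an R package, and the sketch below is not its API. It is only a minimal numpy illustration of what a model check does, loosely following the four components above: draw replicated datasets from the model's predictive distribution, transform both replicates and observed data with a check statistic, and compare. All names and data here are made up for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in observed data: 50 points from a right-skewed distribution.
    observed = rng.gamma(shape=2.0, scale=1.5, size=50)

    # A deliberately misspecified fitted model: a normal with matched moments.
    mu, sigma = observed.mean(), observed.std(ddof=1)

    # Component 1: sampling -- replicated datasets from the model's
    # predictive distribution.
    n_draws = 4000
    replicates = rng.normal(mu, sigma, size=(n_draws, observed.size))

    # Component 2: transformation -- a check statistic; here skewness,
    # which the normal model cannot reproduce.
    def skewness(x):
        z = (x - x.mean(axis=-1, keepdims=True)) / x.std(axis=-1, keepdims=True)
        return (z ** 3).mean(axis=-1)

    obs_stat = skewness(observed[None, :])[0]
    rep_stats = skewness(replicates)

    # Components 3 and 4 would be the visual representation and comparative
    # layout; numerically, a predictive p-value summarizes the comparison.
    p_value = (rep_stats >= obs_stat).mean()
    print(f"observed skewness {obs_stat:.2f}, predictive p-value {p_value:.3f}")
    ```

    A small predictive p-value here flags that the observed skewness sits in the tail of what the model can generate, which is exactly the kind of discrepancy a graphical check would make visible.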


    Pat Metheny: MoonDial

    The central vibe here is one of resonant contemplation. This guitar allows me to go deep. Deep to a place that I maybe have never quite gotten to before. This is a dusk-to-sunrise record, hard-core mellow.

    I have often found myself as a listener searching for music to fill those hours, and honestly, I find it challenging to find the kinds of things I like to hear. As much “mellow” music as there is out there, a lot of it just doesn’t do the thing for me.

    This record might offer something to the insomniacs and all-night folks looking for the same sounds, harmonies, spirits, and melodies that I was in pursuit of during the late nights and early mornings that this music was recorded.

    The above is from Pat’s website. I discovered Pat Metheny relatively recently and have grown to like his music. Last year he released MoonDial, which I picked up last week. It’s nice.

    Check it out:

    While I know nothing about musical instruments, the man is a proper geek:

    Some years back, I had asked Linda Manzer, one of the best luthiers on the planet and one of my major collaborators, to build me yet another acoustic Baritone guitar, but this time one with nylon strings as opposed to the steel string version that I had used on the records One Quiet Night and What’s It All About.

    My deep dive into the world of Baritone guitar began when I remembered that as a kid in Missouri, a neighbor had shown me a unique way of stringing where the middle two strings are tuned up an octave while the general tuning of the Baritone instrument remains down a 4th or a 5th. This opened up a dimension of harmony that had been previously unavailable to me on any conventional guitar.

    There were never really issues with Linda’s guitar itself, but finding nylon strings that could manage that tuning without a) breaking or b) sounding like a banjo – was difficult.

    Just before we hit the road, I ran across a company in Argentina (Magma) that specialized in making a new kind of nylon string with a tension that allowed precisely the sound I needed to make Linda’s Baritone guitar viable in my special tuning.


    Lake bacteria evolve like clockwork with the seasons

    This article covers a pair of studies on bacteria and viruses in a lake.

    researchers found that over the course of a year, most individual species of bacteria in Lake Mendota rapidly evolve, apparently in response to dramatically changing seasons.

    Gene variants would rise and fall over generations, yet hundreds of separate species would return, almost fully, to near copies of what they had been genetically prior to a thousand or so generations of evolutionary pressures.

    From the preprint of the virus paper:

    In the evolutionary arms race between viruses and their hosts, “kill-the-winner” and other forms of dynamics frequently occur, causing fluctuations in the abundance of various viral strains. Despite these fluctuations, certain viral species persist over extended periods and demonstrate high occurrence over time, indicating their evolutionary success in adapting to changing environmental conditions. These high occurrence viral species may represent a ‘royal family’ viral species in the model used to explain the “kill-the-winner” dynamics, where certain sub-populations with enhanced viral fitness have descendants that become dominant in subsequent “kill-the-winner” cycles. It is probable that these high occurrence viral species maintain a stable presence at the coarse diversity level while undergoing continuous genomic and physiological changes at the microdiversity level. The dynamics at the level of viral and host interactions play a pivotal role in driving viral evolution and maintaining the dominance of ‘royal family’ viral species.


    Image credit: Sitting cat, from behind (1812) drawing in high resolution by Jean Bernard. Original from the Rijksmuseum. Digitally enhanced by rawpixel.

  • Jan. 5, 2025


    Improving Research Through Safer Learning from Data

    Another one of Frank Harrell‘s posts. Given my day job and R&D background, this one hits quite close to home. As a team leader on the industry side, one hopes to build a culture with the team that aligns scientific rigor with company goals. Any method, statistical or cultural (in this case both), that resolves this tension will get you the most bang for your buck.

    Building a startup, doing research, or even just launching a moonshot project depends on emergent actions and decisions. Behind this madness there is a kind of Science of “Muddling Through”, and Bayesian methods are perhaps best equipped for it:

    make all the assumptions you want, but allow for departures from those assumptions. If the model contains a parameter for everything we know we don’t know (e.g., a parameter for the ratio of variances in a two-sample t-test), the resulting posterior distribution for the parameter of interest will be flatter, credible intervals wider, and confidence intervals wider. This makes them more likely to lead to the correct interpretation, and makes the result more likely to be reproducible.

    In an environment of limited resources (time, money, investor patience), being able to quantify your data and obtain bankable evidence is critical.

    only the Bayesian approach allows insertion of skepticism at precisely the right point in the logic flow, one can think of a full Bayesian solution (prior + model) as a way to “get the model right”, taking the design and context into account, to obtain reliable scientific evidence.

    The post provides a much more in-depth view, including an 8-fold path to enhancing the scientific process.
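    Harrell's variance-ratio example can be sketched numerically. With flat priors on each group, a model that allows the variances to differ gives the mean difference a Behrens–Fisher-style posterior (simulated here by Monte Carlo), whose interval is wider than the equal-variance model's when the spreads really do differ. The data and variable names below are made up for illustration.

    ```python
    import numpy as np
    from scipy import stats

    # Illustrative data: two groups with very different spreads.
    a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    b = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
    na, nb = len(a), len(b)

    rng = np.random.default_rng(1)
    n_sim = 200_000

    # Model WITH a parameter for unequal variances: under flat priors each
    # group mean has a scaled-t posterior, so the difference is Behrens-Fisher.
    mu_a = a.mean() + a.std(ddof=1) / np.sqrt(na) * rng.standard_t(na - 1, n_sim)
    mu_b = b.mean() + b.std(ddof=1) / np.sqrt(nb) * rng.standard_t(nb - 1, n_sim)
    lo_u, hi_u = np.quantile(mu_b - mu_a, [0.025, 0.975])

    # Model WITHOUT that parameter: pooled variance, classic equal-variance t.
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    se = np.sqrt(sp2 * (1 / na + 1 / nb))
    tcrit = stats.t.ppf(0.975, na + nb - 2)
    d = b.mean() - a.mean()
    lo_p, hi_p = d - tcrit * se, d + tcrit * se

    width_unequal = hi_u - lo_u
    width_pooled = hi_p - lo_p
    print(f"interval width: unequal-variance {width_unequal:.1f}, "
          f"pooled {width_pooled:.1f}")
    ```

    The wider interval is the honest one: it carries the extra uncertainty from not assuming the variance ratio is 1, which is exactly Harrell's point about flatter posteriors being more likely to reproduce.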


    Why I walk

    Chris Arnade gives us a view into why he goes on insanely long (in both distance and time) trips by foot and what he discovers there. I found this passage hilarious: money seems to converge on a version of living that is identical everywhere, just with different paint, even as the rich presumably search for a unique way of living.

    Every large global city has a few upscale neighborhoods that are effectively all the same. It is where the very rich have their apartments, the five and four star hotels are, the most famous museums, and a shopping district with the same stores you would find in Manhattan’s Upper East Side, or London’s Mayfair.

    The only difference is the branding. So you get the Upper East Side with Turkish affectations, or a Peruvian themed Mayfair. The residents of these neighborhoods are also pretty comfortable in any global city. As long as it is the right neighborhood.

    Having traveled a fair bit, I find the designation of tourist brings with it multiple horrors. Chris uses walking as a kind of mini-residency and side-steps much of it:

    Walking also changes how the city sees you, and consequently, how you see the city. As a pedestrian you are fully immersed in what is around you, literally one of the crowd. It allows for an anonymity, that if used right, breaks down barriers and expectations. It forces you to deal with and interact with things and people as a resident does. You’re another person going about your day, rather than a tourist looking to buy or be sold whatever stuff and image a place wants to sell you.

    This particular experience reminded me of Shantaram, a book about a foreigner being adsorbed to and absorbed by the local.


    AI chatbots fail to diagnose patients by talking with them

    While interesting, the research uses LLMs to play both patient and doctor. I like that research is happening in this area and that we are at least moving in the right direction with the testing of AI, i.e., beyond structured tests. I would perhaps not judge the “failure” too harshly, as the source of the data is also an LLM and would suffer from deficiencies that an actual patient experiencing symptoms would not.

    This paper introduces the Conversational Reasoning Assessment Framework for Testing in Medicine (CRAFT-MD) approach for evaluating clinical LLMs. Unlike traditional methods that rely on structured medical examinations, CRAFT-MD focuses on natural dialogues, using simulated artificial intelligence agents to interact with LLMs in a controlled environment.

    While the paper reports a negative result, we do come away with a good set of recommendations for future evaluations.

    Link to paper; research by Shreya Johori et al. from the lab of Pranav Rajpurkar.
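    I haven't looked at CRAFT-MD's code; the toy below only sketches the general pattern the paper describes: a simulated patient agent and a doctor model exchanging natural-dialogue turns until the doctor commits to a diagnosis. Both "agents" here are scripted stand-ins, not real LLM calls, and every name and string is hypothetical.

    ```python
    # Hypothetical stand-ins for two LLM agents. In CRAFT-MD both roles are
    # played by models; here they are scripted so the loop is runnable.
    def patient_agent(question: str) -> str:
        script = {
            "opening": "I've had a rash on my elbows for two weeks.",
            "Is it itchy?": "Yes, very itchy, worse at night.",
            "Any scaling or silvery patches?": "Yes, silvery scaling on top.",
        }
        return script.get(question, "I'm not sure.")

    def doctor_agent(transcript: list[str]) -> str:
        # A real evaluation would query an LLM with the full transcript;
        # this stub asks fixed follow-ups, then commits to a diagnosis.
        questions = ["Is it itchy?", "Any scaling or silvery patches?"]
        asked = sum(1 for turn in transcript if turn.startswith("DOCTOR:"))
        if asked < len(questions):
            return questions[asked]
        return "DIAGNOSIS: psoriasis"

    # The conversational evaluation loop: alternate turns until a diagnosis,
    # which a grader would then score against the case's ground truth.
    transcript = [f"PATIENT: {patient_agent('opening')}"]
    while True:
        doctor_turn = doctor_agent(transcript)
        transcript.append(f"DOCTOR: {doctor_turn}")
        if doctor_turn.startswith("DIAGNOSIS:"):
            break
        transcript.append(f"PATIENT: {patient_agent(doctor_turn)}")

    print("\n".join(transcript))
    ```

    The point of the conversational setup is visible even in this stub: the doctor only gets information it thinks to ask for, which is a very different test than handing the model a complete structured vignette.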


    Fediverse Reactions