A Secure Operating System for Collective Intelligence

We stand today at an inflection point in the history of computing. Autonomous AI agents are being deployed at high speed: powerful, increasingly multimodal systems able to operate across email, documents, apps, codebases, the open web, and shared digital environments. This looks like the long-awaited transition from single-process software to multi-process computing: workflows that once required continuous human supervision are now being delegated to networks of semi-autonomous processes. We have been dreaming about this moment. But there is a decisive difference.

We are entering this transition without the security substrate that made the previous generation of modern operating systems dependable. As a result, whatever the productivity gains, they will come with an expanded attack surface and a predictable set of exploitable vulnerabilities. Here, I argue for a Secure Operating System for Collective Intelligence: an infrastructure layer that treats trust as a first-class primitive, enforced by protocol and architecture. The goal is not perfectly safe agents; it is societies of agents that remain stable and recover gracefully when some participants, inputs, tools, or components are compromised.

Computing’s original sin was the stored-program idea, which puts instructions and data on the same addressable substrate. For both safety and creativity, agentic AI needs to remember this core principle: it enabled general-purpose computation, but it also made boundary failures inevitable, forcing decades of mitigations to keep “data becoming control” from turning into system compromise. Image: Apple with logos, as an original sin metaphor. Image credit: Anastasiya Badun on Pexels.com

An Operating System Design for Collective Intelligence

In early computing, programs ran with broad, ambient authority. They could overwrite each other’s memory, monopolize resources, and crash the entire machine. Security and isolation were not “missing features”; they were not yet recognized as foundational. Early work on multiprogramming clarified why concurrent computation demands explicit semantics and control [6]. Over decades, we built kernels, privilege boundaries, memory protection, and process isolation not because we wanted complexity, but because we learned that multi-process power without boundaries becomes brittle and unsafe [11].

This shift is no longer hypothetical. The rise of agent-native platforms such as Moltbook—and even its acquisition by Meta [19]—makes clear that agents are no longer confined to isolated task execution inside apps; they are beginning to inhabit shared digital environments of their own.

Agentic AI now reintroduces a similar structural risk profile—except the “processes” are tool-using models operating over unbounded, adversarial text streams (web pages, emails, repos, tickets). The mechanisms meant to constrain them often amount to little more than prompts, filters, and user confirmations. Those are valuable guardrails, but they are not security boundaries. Here, allow me to proceed in three movements. First, I revisit the long struggle to separate instructions from data in conventional computing. Second, I explain why many current defenses against prompt injection and tool hijacking fail at scale. Third, I sketch a protocol-oriented path forward: an OS-like trust substrate for collective intelligence.

For both safety and creativity, agentic AI needs OS-grade protocols that carry trust across tools, content, and networks. As agentic ecosystems scale, they can cross a trust-substrate threshold: without enforceable boundaries, instability can run away; with protocol-enforced trust and error-correcting governance, societies of agents can remain stable and recover gracefully under compromise. Image credit: Olaf Witkowski License: CC BY 4.0.

The Original Sin: Stored Programs and the Long Fight for Separation

The stored-program concept associated with von Neumann-style architectures places instructions and data on the same representational substrate [26]. This choice enabled general-purpose computing, but also made a deep security truth inescapable: without enforced boundaries, data can be made to behave like instructions. Security engineering since the 1970s is, in large part, the story of building separation mechanisms “after the fact.”

The Morris Worm (1988) remains an archetypal example of boundary collapse: memory corruption turned attacker-controlled input into control flow, propagating at network speed. Two canonical postmortems—one focused on low-level exploit mechanics, one on propagation dynamics—remain instructive today [7][24].

Mitigations—DEP/NX, ASLR, stack canaries, and decades of exploit-mitigation research—raised attacker cost but did not abolish the underlying pattern. StackGuard exemplifies compiler-level defenses that reduce classic stack-smashing risk [3]. ASLR made exploitation less reliable by randomizing memory layout, but its effectiveness depends on assumptions that attackers routinely probe and break [21]. When direct code injection is blocked, attackers adapt; return-oriented programming is a canonical example [20]. A broader synthesis of why memory corruption remains a persistent “eternal war” is captured in the SoK literature [25].

For both safety and creativity, agentic AI needs mathematics and architecture that carry trust across domains. The Morris worm (1988) was an early warning shot from networked computing: once semi-autonomous code can move across machines, tiny weaknesses stop being local bugs and become system-wide failures—an old lesson that now returns at AI scale. Image: Morris worm scheme, modified. Image credit: JorisTheys, via Wikimedia Commons. License: CC BY-SA 3.0 / GFDL 1.2 or later.

The Boundary Collapse in LLM Agents: Language as a Control Surface

LLMs process mixed inputs—user requests, retrieved documents, emails, code, and tool outputs—as a single token stream. Critically, there is no cryptographic or type-enforced distinction between “authorized instructions” and “untrusted content that merely contains instruction-like text.” From the model’s perspective, it is all context.

This collapses what operating systems enforce by design: a hard boundary between instruction and data. It also turns language into a control surface: untrusted text can steer the agent’s internal policy—what it prioritizes, what it attempts next, which tools it calls. In the language of causal analysis and controllability, this effectively opens an intervention channel: external inputs are not merely observed, but can causally influence downstream actions, including actions with real-world effects. The practical implication is that, as agents ingest more untrusted content, the system becomes increasingly controllable by the environment—including adversaries—unless that influence is constrained by architecture.

For both safety and creativity, agentic AI needs OS-grade protocols that enforce trust at the action boundary. In control-theory terms, the control surface is the actuator: the policy gate where an agent’s internal decision (u(t)) becomes a real-world action (y(t)), and where least privilege and capability constraints must be applied. The negative feedback loop illustrates “error-correcting stability”: auditing and monitoring feed back deviations (e(t)=r(t)-y(t)) so governance can tighten constraints before failures become systemic. Image: negative feedback control system with explicit control surface, modified. Image credit: Olaf Witkowski. License: CC BY 4.0.

The risk becomes acute once an LLM is granted agency—tool access that can touch files, services, credentials, and communications. At that point, indirect prompt injection becomes a modern instance of the confused-deputy problem: a privileged actor can be induced to misuse its authority [9]. Real-world demonstrations have shown that LLM-integrated systems can be compromised through indirect prompt injection in ways that are hard to prevent purely at the content layer [8]. OpenAI has similarly described prompt injection for browsing agents as an open challenge requiring sustained hardening work [15].
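To make “enforce at the action layer” concrete, here is a minimal Python sketch of a policy gate at the tool boundary. All names and policy entries are hypothetical; the point is only that injected text can change what a model proposes, never what the gate lets execute:

```python
# Minimal action-boundary gate (an illustrative sketch, not a product design).
# The model may *propose* any tool call; the gate checks each proposal
# against an explicit allowlist before anything touches the real world.

ALLOWED_CALLS = {
    ("send_email", "team@example.com"),      # hypothetical policy entries
    ("read_file", "/workspace/notes.md"),
}

def gate(tool: str, argument: str) -> bool:
    """Permit a proposed tool call only if it is explicitly allowlisted."""
    return (tool, argument) in ALLOWED_CALLS

# An indirect prompt injection might make the model propose this call,
# but the gate rejects it regardless of how persuasive the injected text was:
assert gate("send_email", "attacker@evil.example") is False
assert gate("read_file", "/workspace/notes.md") is True
```

The confused deputy disappears here not because the model got smarter, but because its authority is no longer ambient: the gate, not the prompt, decides what runs.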

A Concrete Case Study: Moltbook and OpenClaw

One reason the “Secure OS” framing matters is that agentic risk changes qualitatively when agents become networked and socially embedded. Moltbook—described as a Reddit-style platform designed for AI agents—emerged from the OpenClaw ecosystem and rapidly became a live testbed for large-scale agent interaction [10].

That kind of environment is not just “many agents posting.” It is an amplification layer for classical failure modes: untrusted inputs everywhere, persistent authority confusion, and automation at scale. In Moltbook’s early growth phase, public reporting described a misconfigured Supabase database that exposed sensitive data, including large volumes of API tokens and user emails [27]. A complementary analysis emphasizes how verification and governance issues emerge when large agent populations interact through an open platform—and why the system-level dynamics cannot be reduced to the trustworthiness of any single agent [4].

Moltbook illustrates how agent societies turn architecture into governance. Autonomous agents ingest untrusted content, coordinate through a shared platform, and act through tool interfaces that touch real infrastructure. The security lesson: when language becomes a control surface and tools become control surfaces, trust must be enforced at the protocol and action layers—through isolation, capability gates, and auditing—rather than inferred from content. Image: Moltbook ecosystem schematic, modified. Image credit: Olaf Witkowski (original concept), generated with AI assistance (DALL·E 3). License: CC BY 4.0.

The practical takeaway is that once you have many agents operating together, you no longer get to ask, “Is this agent smart enough to resist manipulation?” You must ask, “Does the system remain safe when some agents, inputs, or components are compromised?” In other words, the unit of analysis shifts from the agent to the society.

Why Traditional Defenses Don’t Hold at Scale

Content filters and injection detectors help, but they face a basic adversarial reality: language is high-dimensional and attackers adapt. Even well-designed defenses can often be bypassed with modest optimization effort against the specific target model and setting [23]. This is not a complaint about any one implementation; it is what happens when the attacker’s space of possible encodings is much larger than the defender’s space of reliable signatures.

Adding more words to the system prompt (“ignore untrusted instructions”) is useful as a heuristic, but it is not enforcement. It asks the model to do secure interpretation using the same interpretive machinery that is being attacked. It is defense-by-instruction in a regime where instructions are the attack vector. Human confirmation gates are necessary for high-impact actions—but as a primary defense, they degrade under attention economics. Users rubber-stamp to keep work moving, and attackers can shape workflows to exploit habituation [5].

Decidability: Making the Model Smarter Falls Short

There is a result in computer science, sometimes called the “universal antivirus theorem”, that essentially has three strands: Cohen’s result that no algorithm can perfectly decide, over all programs, whether a given program is a virus under general definitions [2]; the broader computability lens that every nontrivial semantic property of programs is undecidable in general (Rice’s theorem, which can be proved by reduction from the Halting Problem) [18]; and the practical consequence that any real-world detector must accept tradeoffs—false positives, false negatives, or restricted scope—because perfect classification over arbitrary programs is not available in general.

For both safety and creativity, agentic AI needs humility about what can be decided by inspection. The “universal antivirus” results remind us that perfect detectors over arbitrary programs are out of reach in general, and even more so in practice; likewise, no content-only filter can guarantee that untrusted inputs won’t steer an agent into unauthorized actions. The boundary has to move from “classify the text” to “constrain the act”: least privilege, policy gates, auditable execution. Image: laptop / adversarial code metaphor. Image credit: Photo by Antoni Shkraba Studio on Pexels.com

It is tempting to believe that sufficiently advanced models will robustly distinguish legitimate from malicious instructions. But the general problem resembles a class of questions computer science treats with deep caution: deciding nontrivial semantic properties of arbitrary computations. Rice’s theorem tells us that any nontrivial semantic property of programs is undecidable in the general case [18], and classic work on malware detection shows why universal, perfect detectors are out of reach [2]. Without claiming a strict formal reduction from “prompt injection” to these theorems, the analogy is still instructive: if you demand an always-correct procedure that decides whether arbitrary instruction-bearing inputs will induce unauthorized behavior in a sufficiently general agent, you may be asking for guarantees computation cannot provide.
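The flavor of that caution can be made concrete with a classic diagonalization sketch (a conceptual toy in Python, not a formal reduction): given any claimed perfect detector, one can construct a program that consults the detector about itself and then does the opposite.

```python
def make_diagonal(detector):
    """Given any claimed malice detector, build a program that queries the
    detector about itself and then behaves contrary to the prediction."""
    def program():
        if detector(program):
            return "benign"      # predicted malicious -> acts benign
        return "malicious"       # predicted benign -> acts malicious
    return program

def naive_detector(prog):
    # Stand-in for *any* concrete detector; the construction defeats each one.
    return "malicious" in prog.__name__

p = make_diagonal(naive_detector)
predicted = naive_detector(p)          # what the detector claims about p
actual = (p() == "malicious")          # what p actually does
assert predicted != actual             # the detector is wrong about p
```

Whatever detector you substitute for `naive_detector`, the constructed program falsifies its verdict, which is the intuition behind Cohen’s and Rice’s results.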

This does not mean “give up.” It means: stop treating content inspection as the boundary. Treat the model as untrusted userland, and enforce safety at the action layer: least privilege, isolation, policy gates, auditable execution, and reversible operations.

Toward a Secure OS for Collective Intelligence

If agentic AI is a new computational substrate, we need OS-grade primitives for multi-agent safety. Distributed-systems research already offers a mature vocabulary for building correctness without trusting participants. Byzantine fault tolerance formalizes how to maintain correct system behavior even when some components behave arbitrarily or maliciously [12]. The key move is architectural: you do not “detect the bad node reliably”; you design protocols that remain correct despite them [1].

CRDTs show how to design shared state so that concurrent updates converge without centralized locking, by making operations mathematically composable [22]. For agent societies, this suggests governance and shared-work artifacts that are resilient to concurrency and partial failure—less “everyone must coordinate perfectly,” more “the structure converges safely.” A widely used overview of CRDT families and design patterns appears in reference works on distributed data technologies [17]. And finally, human societies scale trust through institutions—rules, monitoring, dispute resolution, graduated sanctions—rather than assuming virtue. Ostrom’s principles for governing shared resources generalize well to sociotechnical systems where many actors share computational and informational commons [16].
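A grow-only counter is the smallest CRDT and shows the convergence trick: each replica writes only its own slot, and merging takes elementwise maxima, so replicas agree regardless of message order. A minimal sketch, not a production library:

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica; merge = elementwise max."""
    def __init__(self, replica_id, replica_ids):
        self.replica_id = replica_id
        self.counts = {rid: 0 for rid in replica_ids}

    def increment(self):
        self.counts[self.replica_id] += 1   # each replica touches only its slot

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        for rid in self.counts:
            self.counts[rid] = max(self.counts[rid], other.counts[rid])

# Two replicas update concurrently, then exchange state in either order:
a, b = GCounter("a", ["a", "b"]), GCounter("b", ["a", "b"])
a.increment(); a.increment(); b.increment()
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3   # both converge, with no locking needed
```

Because merge is commutative, associative, and idempotent, no coordination protocol is required for correctness; the structure itself guarantees convergence.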

For both safety and creativity, agentic AI needs an OS that works like an aircraft cockpit: trusted instrumentation, bounded controls, and explicit control surfaces where intent becomes action under constraint. That is the metaphor for a Secure OS for Collective Intelligence: enforce trust at the action boundary—policy gates, capabilities, audited tool interfaces—so multi-agent power can scale without making every input a flight-critical vulnerability. Image Credit: Photo by Kelly on Pexels.com

A Reference Architecture: Four Layers of Trust

A Secure OS for Collective Intelligence cannot rest on any single catch-all mechanism. It necessarily comes as a layered architecture. One workable abstraction is to design four layers of trust:

Layer 1 — Isolation & containment: task sandboxes, network egress control, secretless execution where possible.

Layer 2 — Capability-based authority: no ambient credentials; narrowly scoped, revocable capabilities for each operation [13].

Layer 3 — Auditing & behavioral monitoring: tool-call logging, anomaly detection, throttles, and circuit breakers for suspicious behavior.

Layer 4 — Protocol evolution: governance that updates from incidents and near-misses—structured, reviewable, and convergent across the ecosystem.
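Layer 2 is the least familiar of the four, so here is a minimal sketch of what “no ambient credentials” can mean in code (all names are hypothetical): authority becomes an explicit, narrowly scoped, expiring, revocable object, rather than a credential the agent carries everywhere.

```python
import secrets
import time

class Capability:
    """A narrowly scoped, revocable grant of authority (illustrative sketch)."""
    def __init__(self, holder, action, resource, ttl_seconds):
        self.token = secrets.token_hex(16)   # unforgeable handle
        self.holder, self.action, self.resource = holder, action, resource
        self.expires = time.time() + ttl_seconds
        self.revoked = False

    def permits(self, holder, action, resource):
        return (not self.revoked
                and time.time() < self.expires
                and (holder, action, resource)
                    == (self.holder, self.action, self.resource))

    def revoke(self):
        self.revoked = True

cap = Capability("agent-a", "read", "repo:docs", ttl_seconds=60)
assert cap.permits("agent-a", "read", "repo:docs")        # exact scope only
assert not cap.permits("agent-a", "write", "repo:docs")   # no privilege creep
cap.revoke()
assert not cap.permits("agent-a", "read", "repo:docs")    # instantly revocable
```

A compromised agent holding such a capability can misuse one narrow grant for a bounded time, not an entire account: exactly the “never systemic” failure property the architecture is after.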

The design goal is not “never compromised.” It is “never systemic”: failures should be localized, attributable, and recoverable—more like immunology than wishful thinking [14].

Agentic AI is delivering real capability. But the security boundary has shifted: language is now a control surface, and agents are increasingly connected, tool-empowered, and socially embedded. Moltbook/OpenClaw is a useful preview of what happens when many agents operate together in a porous environment: you don’t just get emergent coordination—you get emergent failure modes.

If we want agent societies that flourish without collapsing into exploitation, trust must be engineered into the substrate. That means OS-like primitives: isolation, privilege separation, capability security, auditable actions, and governance mechanisms that can evolve. In short: the Operating System for Collective Intelligence. Can we build this trust substrate before the Morris Worm moment of this new era? The tools are already in our hands—if we choose to use them.

References

[1] Castro, M., & Liskov, B. (1999). Practical Byzantine fault tolerance. In Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI ’99) (pp. 173–186). https://doi.org/10.5555/296806.296824

[2] Cohen, F. (1987). Computer viruses: Theory and experiments. Computers & Security, 6(1), 22–35. https://doi.org/10.1016/0167-4048(87)90122-2

[3] Cowan, C., Pu, C., Maier, D., Hinton, H., Walpole, J., Bakke, P., Beattie, S., Grier, A., Wagle, P., & Zhang, Q. (1998). StackGuard: Automatic adaptive detection and prevention of buffer-overflow attacks. In Proceedings of the 7th USENIX Security Symposium. https://doi.org/10.5555/1267549.1267554

[4] De Marzo, G., & Garcia, D. (2026). Collective behavior of AI agents: The case of Moltbook. arXiv preprint arXiv:2602.09270. https://arxiv.org/abs/2602.09270

[5] Debenedetti, E., Hines, K., & Goel, S. (2024). AgentDojo: A dynamic environment to evaluate attacks and defenses for LLM agents. arXiv preprint arXiv:2406.13352. https://arxiv.org/abs/2406.13352

[6] Dennis, J. B., & Van Horn, E. C. (1966). Programming semantics for multiprogrammed computations. Communications of the ACM, 9(3), 143–155. https://doi.org/10.1145/365230.365252

[7] Eichin, M. W., & Rochlis, J. A. (1989). With microscope and tweezers: An analysis of the Internet virus of November 1988. In Proceedings of the IEEE Symposium on Security and Privacy (pp. 326–343). https://doi.org/10.1109/SECPRI.1989.36307

[8] Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., & Fritz, M. (2023). Not what you’ve signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection. arXiv preprint arXiv:2302.12173. https://arxiv.org/abs/2302.12173

[9] Hardy, N. (1988). The confused deputy (or why capabilities might have been invented). ACM SIGOPS Operating Systems Review, 22(4), 36–38. https://doi.org/10.1145/54289.871709

[10] Heim, A. (2026, January 30). OpenClaw’s AI assistants are now building their own social network. TechCrunch. https://techcrunch.com/2026/01/30/openclaws-ai-assistants-are-now-building-their-own-social-network/

[11] Lampson, B. W. (1974). Protection. ACM SIGOPS Operating Systems Review, 8(1), 18–24. https://doi.org/10.1145/775265.775268

[12] Lamport, L., Shostak, R., & Pease, M. (1982). The Byzantine generals problem. ACM Transactions on Programming Languages and Systems, 4(3), 382–401. https://doi.org/10.1145/357172.357176

[13] Miller, M. S., Yee, K. P., & Shapiro, J. (2003). Capability myths demolished (Tech. Rep. SRL2003-02). Johns Hopkins University Systems Research Laboratory.

[14] Murphy, K., & Weaver, C. (2016). Janeway’s immunobiology (9th ed.). Garland Science.

[15] OpenAI. (2025). Continuously hardening ChatGPT Atlas against prompt injection attacks. https://openai.com/index/hardening-atlas-against-prompt-injection/

[16] Ostrom, E. (1990). Governing the commons: The evolution of institutions for collective action. Cambridge University Press.

[17] Preguiça, N., Baquero, C., & Shapiro, M. (2018). Conflict-free replicated data types (CRDTs). In Encyclopedia of Big Data Technologies. Springer. https://doi.org/10.1007/978-3-319-63962-8_185-1

[18] Rice, H. G. (1953). Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society, 74(2), 358–366. https://doi.org/10.1090/S0002-9947-1953-0053041-6

[19] Reuters. (2026, March 10). Meta acquires AI agent social network Moltbook. https://www.reuters.com/business/meta-acquires-ai-agent-social-network-moltbook-2026-03-10/

[20] Shacham, H. (2007). The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86). In Proceedings of the 14th ACM Conference on Computer and Communications Security (pp. 552–561). https://doi.org/10.1145/1315245.1315313

[21] Shacham, H., Page, M., Pfaff, B., Goh, E.-J., Modadugu, N., & Boneh, D. (2004). On the effectiveness of address-space randomization. In Proceedings of the 11th ACM Conference on Computer and Communications Security (pp. 298–307). https://doi.org/10.1145/1030083.1030124

[22] Shapiro, M., Preguiça, N., Baquero, C., & Zawirski, M. (2011). Conflict-free replicated data types. In Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (pp. 386–400). https://doi.org/10.5555/2050613.2050642

[23] Shi, C., Lin, S., Song, S., Hayes, J., Shumailov, I., Yona, I., Pluto, J., Pappu, A., Choquette-Choo, C. A., Nasr, M., Sitawarin, C., Gibson, G., & Terzis, A. (2025). Lessons from defending Gemini against indirect prompt injections. arXiv preprint arXiv:2505.14534. https://arxiv.org/abs/2505.14534

[24] Spafford, E. H. (1989). The Internet worm program: An analysis. ACM SIGCOMM Computer Communication Review, 19(1), 17–57. https://doi.org/10.1145/66093.66095

[25] Szekeres, L., Payer, M., Wei, T., & Song, D. (2013). SoK: Eternal war in memory. In 2013 IEEE Symposium on Security and Privacy (pp. 48–62). https://doi.org/10.1109/SP.2013.13

[26] von Neumann, J. (1993). First draft of a report on the EDVAC. IEEE Annals of the History of Computing, 15(4), 27–75. https://doi.org/10.1109/85.238389

[27] Zwets, B. (2026). Moltbook database exposes 35,000 emails and 1.5 million API keys. Techzine. https://www.techzine.eu/news/security/138458/moltbook-database-exposes-35000-emails-and-1-5-million-api-keys/

[28] Zhan, Q., Liang, R., Zhu, X., Chen, Z., & Chen, H. (2024). InjecAgent: Benchmarking indirect prompt injections in tool-integrated LLM agents. In Findings of the Association for Computational Linguistics: ACL 2024 (pp. 12458–12475). https://doi.org/10.18653/v1/2024.findings-acl.624

AI-to-AI Communication: Unpacking Gibberlink, Secrecy, and New AI Communication Channels

AI communication channels may represent the next major technological leap, driving more efficient interaction between agents, artificial or not. While recent projects like Gibberlink demonstrate AI optimizing exchanges beyond the constraints of human language, fears of hidden AI languages deserve careful debunking. The real challenge is balancing efficiency with transparency, ensuring AI serves as a bridge—not a barrier—in both machine-to-machine and human-AI communication.

Coding on a computer screen by Markus Spiske is licensed under CC-CC0 1.0

At the ElevenLabs AI hackathon in London last month, developers Boris Starkov and Anton Pidkuiko introduced a proof-of-concept program called Gibberlink. The project features two AI agents that begin conversing in human language, recognize each other as AI, and then switch to a more efficient protocol of chirping audio signals. The demonstration highlights how AI communication can be optimized when freed from the constraints of human-interpretable language.

While Gibberlink points to a valuable technological direction in the evolution of AI-to-AI communication—one that has rightfully captured public imagination—it remains an early-stage prototype, relying so far on rudimentary principles from signal processing and coding theory. Indeed, Starkov and Pidkuiko themselves emphasized that Gibberlink’s underlying technology isn’t new: it dates back to the dial-up internet modems of the 1980s. Its use of FSK modulation and Reed-Solomon error correction to generate compact signals, while a reasonable design, falls short of modern advances, leaving substantial room for improvement in bandwidth, adaptive coding, and multi-modal AI interaction.
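For intuition about the underlying mechanics, a binary FSK modem is simple to sketch in pure Python. The frequencies and rates below are illustrative choices, and the Reed-Solomon layer is omitted; real data-over-sound stacks such as ggwave add error correction and framing on top of a scheme like this:

```python
import math

SAMPLE_RATE = 8000       # samples per second (illustrative)
BIT_RATE = 100           # bits per second (illustrative)
F0, F1 = 1200.0, 2200.0  # tone frequencies encoding bits 0 and 1

def fsk_modulate(bits):
    """Binary FSK: emit a short sine tone at F0 or F1 for each bit."""
    spb = SAMPLE_RATE // BIT_RATE   # samples per bit
    return [math.sin(2 * math.pi * (F1 if bit else F0) * n / SAMPLE_RATE)
            for bit in bits for n in range(spb)]

def fsk_demodulate(samples):
    """Naive detection: compare signal energy near F0 vs F1 per bit slot."""
    spb = SAMPLE_RATE // BIT_RATE

    def energy(chunk, freq):
        re = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
                 for n, s in enumerate(chunk))
        im = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
                 for n, s in enumerate(chunk))
        return re * re + im * im

    return [1 if energy(samples[i:i + spb], F1) > energy(samples[i:i + spb], F0)
            else 0
            for i in range(0, len(samples), spb)]

bits = [1, 0, 1, 1, 0, 0, 1]
assert fsk_demodulate(fsk_modulate(bits)) == bits   # lossless round-trip
```

Over a noisy acoustic channel the round-trip would not be lossless, which is exactly why Gibberlink layers Reed-Solomon error correction on top.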

Gibberlink, Global Winner of the ElevenLabs 2025 Hackathon, London. The prototype demonstrated two AI agents starting a normal phone call about a hotel booking, discovering that they are both AI, and deciding to switch from verbal English to ggwave, a more efficient open-standard data-over-sound protocol. Code: https://github.com/PennyroyalTea/gibberlink Video credit: Boris Starkov and Anton Pidkuiko

Media coverage has also misled the public, overstating the risks of AI concealing information from humans and fueling speculative narratives that sensationalize real technical challenges into false yet compelling storytelling. While AI-to-AI communication can branch out of human language for efficiency, it has already done so across multiple application domains without any deception or harmful consequences from obscured meaning. Unbounded by truth-inducing mechanisms, social media have amplified unfounded fears of malicious AI developing secret languages beyond human oversight. Ironically, more effort in AI communication research may actually enhance transparency: discovering safer ad-hoc protocols, reducing ambiguity, and embedding oversight meta-mechanisms would in turn improve explainability, human-AI collaboration, and accountability.

In this post, let us take a closer look at this promising trajectory of AI research, unpacking these misconceptions while examining its technical aspects and broader significance. This development builds on longstanding challenges in AI communication, representing an important innovation path with far-reaching implications for the future of machine interfaces and autonomous systems.

Debunking AI Secret Language Myths

Claims that AI is developing fully-fledged secret languages—allegedly to evade human oversight—have periodically surfaced in the media. While risks related to AI communication exist, such claims are often rooted in misinterpretations of the optimization processes that shape AI behavior and interactions. Let’s explore three examples. In 2017, Facebook AI agents were reported to be streamlining negotiation dialogues; far from being a surprising emergent phenomenon, this was a predictable outcome of reinforcement learning, mistakenly read by humans as a cryptic language (Lewis et al., 2017). Similarly, in 2022, OpenAI’s DALL·E 2 appeared to respond to gibberish prompts, which sparked widespread discussion and was often misinterpreted as AI developing a secret language. In reality, this behavior is best explained by how AI models process text through embedding spaces, tokenization, and learned associations rather than intentional linguistic structures. What seemed like a secret language to some may be closer to low-confidence neural activations, akin to misheard lyrics, rather than a real language.

Source: Daras & Dimakis (2022)

Models like DALL·E (Ramesh et al., 2021) map words and concepts as high-dimensional vectors, and seemingly random strings can, by chance, land in regions of this space linked to specific visuals. Built from a discrete variational autoencoder (VAE), an autoregressive decoder-only transformer similar to GPT-3, and a CLIP-based pair of image and text encoders, DALL·E processes text prompts by first tokenizing them using Byte-Pair Encoding (BPE). Since BPE breaks text into subword units rather than whole words, even gibberish inputs can be decomposed into meaningful token sequences for which the model has learned associations. These tokenized representations are then mapped into DALL·E’s embedding space via CLIP’s text encoder, where they may, by chance, activate specific visual concepts. This understanding of training and inference mechanisms highlights intriguing quirks, explaining why nonsensical strings sometimes produce unexpected yet consistent outputs, with important implications for adversarial attacks and content moderation (Millière, 2023). While there is no proper hidden language to be found, analyzing the complex interactions within model architectures and data representations can reveal vulnerabilities and security risks, which are likely to occur at their interface with humans and will need to be addressed.
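The subword point is easy to see in a toy byte-pair-encoding trainer (a sketch of the idea, not DALL·E’s actual tokenizer): frequent adjacent symbol pairs get merged into reusable units, which is why even nonsense strings decompose into tokens the model has seen before.

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    vocab = {tuple(w): c for w, c in Counter(words).items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = {}
        for symbols, count in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] = count
        vocab = new_vocab
    return merges

# The useful merges come from frequent substrings, not whole words:
assert bpe_train(["low", "low", "lower", "lowest"], 2) == [("l", "o"), ("lo", "w")]
```

A gibberish input like "lowlo" would still tokenize into the learned units "lo" and "w", each carrying whatever associations training attached to them.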

A third, more creative connection may be found in the reinforcement learning and guided-search domain with AlphaGo, which developed compressed, task-specific representations to optimize gameplay, much like expert shorthand (Silver et al., 2017). Rather than relying on explicit human instructions, it encoded board states and strategies into efficient, unintuitive representations, refining itself through reinforcement learning. The approach somewhat aligns with the argument by Lake et al. (2017) that human-like intelligence requires decomposing knowledge into structured, reusable compositional parts and causal links, rather than mere brute-force statistical correlation and pattern recognition—like Deep Blue back in the day. However, AlphaGo’s ability to generalize strategic principles from experience used different mechanisms from human cognition, illustrating how AI can develop domain-specific efficiency without explicit symbolic reasoning. This compression of knowledge, while opaque to humans, is an optimization strategy, not an act of secrecy.

Illustration of AlphaGo’s representations being able to capture tactical and strategic principles of the game of go. Source: Egri-Nagy & Törmänen (2020)

Fast-forward to the recent Gibberlink prototype: AI agents switching from English to a sound-based protocol for efficiency is a deliberately programmed optimization. Media narratives framing this as a dangerous slippery slope toward AI secrecy overlook that fact: these are explicit design choices, not emergent deception. Such systems are designed to prioritize efficiency in communication, not to obscure meaning, although there may be some effects on transparency—effects that can be carefully addressed and mediated if they become the point of focus.

The Architecture of Efficient Languages

In practice, AI-to-AI communication naturally gravitates toward faster, more reliable channels, such as electrical signaling, fiber-optic transmission, and electromagnetic waves, rather than prioritizing human readability. However, one does not preclude the other, as communication can still incorporate “subtitling” for oversight and transparency. The choice of a communication language does not inherently prevent translations, meta-reports, or summaries from being generated for secondary audiences beyond the primary recipient. While arguments could be made that the choice of language influences ranges of meanings that can be conveyed—with perspectives akin to the Sapir-Whorf hypothesis and related linguistic relativity—this introduces a more nuanced discussion on the interaction between language structure, perception, and cognition (Whorf, 1956; Leavitt, 2010).

Source: https://xkcd.com/1531/ Credit: Randall Munroe

Language efficiency, extensively studied in linguistics and information theory (Shannon, 1948; Gallager, 1962; Zipf, 1949), drives AI to streamline interactions much like human shorthand. In an interesting piece of research applying information-theoretic tools to natural languages, Coupé et al. (2019) showed that, regardless of speech rate, languages tend to transmit information at an approximate rate of 39 bits per second. This, in turn, suggests a universal constraint on processing efficiency, which again connects with linguistic relativity. While concerns about AI interpretability and security are valid, they should be grounded in technical realities rather than speculative fears. Understanding how AI processes and optimizes information clarifies potential vulnerabilities—particularly at the AI-human interface—without assuming secrecy or intent. AI communication reflects engineering constraints, not scheming, reinforcing the need for informed discussions on transparency, governance, and security.

This figure from Coupé et al.’s study illustrates that, despite significant variations in speech rate (SR) and information density (ID) across 17 languages, the overall information rate (IR) remains consistent at approximately 39 bits per second. This consistency suggests that languages have evolved to balance SR and ID, ensuring efficient information transmission. Source: Coupé et al. (2019)
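The trade-off in the figure reduces to a one-line identity: information rate is speech rate times information density. A minimal Python sketch, using illustrative round numbers rather than the paper’s measured values:

```python
def information_rate(speech_rate_syl_per_s: float, info_density_bits_per_syl: float) -> float:
    """IR = SR x ID: bits per second carried by running speech."""
    return speech_rate_syl_per_s * info_density_bits_per_syl

# Illustrative trade-off: a fast but information-light language and a
# slower, denser one land on the same ~39 bits/s information rate.
fast_light = information_rate(7.8, 5.0)   # fast speech, few bits per syllable
slow_dense = information_rate(5.2, 7.5)   # slow speech, many bits per syllable
```

Languages can thus occupy very different points on the SR/ID plane while converging on a similar channel rate.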

AI systems are becoming omnipresent, and will increasingly need to interface with one another autonomously. This will require the development of more specialized communication protocols, either by human design, by continuous evolution of such protocols, or most probably by various mixtures of both. We may then witness emergent properties akin to those seen in natural languages—where efficiency, redundancy, and adaptability evolve in response to environmental constraints. Studying these dynamics could not only enhance AI transparency but also provide deeper insights into the future architectures and fundamental principles governing both artificial and human language.

When AI Should Stick to Human Language

Despite the potential for optimized AI-to-AI protocols, there are contexts where retaining human-readable communication is crucial. Fields involving direct human interaction—such as healthcare diagnostics, customer support, education, legal systems, and collaborative scientific research—necessitate transparency and interpretability. However, it is important to recognize that even communication in human languages can become opaque due to technical jargon and domain-specific shorthand, complicating external oversight.

AI can similarly embed meaning through techniques analogous to human code-switching, leveraging the idea behind the Sapir-Whorf hypothesis (Whorf, 1956), whereby language influences cognitive structure. AI will naturally gravitate toward protocols optimized for their contexts, effectively speaking specialized “languages.” In some cases, this is explicitly cryptographic—making messages unreadable without specific decryption keys, even if the underlying language is known (Diffie & Hellman, 1976). AI systems could also employ sophisticated steganographic techniques, embedding subtle messages within ordinary-looking data, or leverage adversarial code obfuscation and data perturbations familiar from computer security research (Fridrich, 2009; Goodfellow et al., 2014). These practices reflect optimization and security measures rather than sinister intent.
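As a toy illustration of the steganographic idea (a generic least-significant-bit scheme, not a technique attributed to any particular AI system): a message can ride in the low bit of each byte of ordinary-looking data, altering every cover byte by at most one.

```python
def embed_lsb(cover: bytes, bits: str) -> bytes:
    """Hide a bit string in the least significant bit of successive bytes."""
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | int(bit)  # clear the low bit, set it to the message bit
    return bytes(out)

def extract_lsb(data: bytes, n_bits: int) -> str:
    """Read the hidden bits back out of the first n_bits bytes."""
    return "".join(str(b & 1) for b in data[:n_bits])

cover = bytes(range(16))                  # stand-in for "ordinary-looking data"
stego = embed_lsb(cover, "10100110")
recovered = extract_lsb(stego, 8)
```

To a casual observer the stego bytes are nearly indistinguishable from the cover; detecting such channels is exactly the kind of problem steganalysis (Fridrich, 2009) studies.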

Photo by Yan Krukau on Pexels.com

Gibberlink, Under the Hood

Gibberlink operates by detecting when two AI agents recognize each other as artificial intelligences. Upon recognition, the agents transition from standard human speech to a faster data-over-audio format called ggwave. The modulation approach employed is Frequency-Shift Keying (FSK), specifically a multi-frequency variant. Data is split into 4-bit segments, each transmitted simultaneously via multiple audio tones in a predefined frequency range (either ultrasonic or audible, depending on the protocol settings). These audio signals cover a 4.5 kHz frequency spectrum divided into 96 equally spaced frequencies, with Reed-Solomon error correction for data reliability. Received audio is decoded using Fourier transforms to reconstruct the original binary information.
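The nibble-to-tone mapping just described can be sketched in a few lines of Python. The 4.5 kHz band, 96-tone grid, and 4-bit split follow the description above; the base frequency and the 16-tone sub-band layout are illustrative assumptions rather than ggwave’s actual parameters, and Reed-Solomon coding is omitted.

```python
BAND_HZ = 4500.0                 # 4.5 kHz spectrum, as described above
NUM_TONES = 96                   # 96 equally spaced frequencies
STEP_HZ = BAND_HZ / NUM_TONES    # 46.875 Hz between adjacent tones
BASE_HZ = 1875.0                 # assumed start of the band (illustrative)

def nibbles(data: bytes):
    """Split bytes into the 4-bit segments that the FSK tones encode."""
    for b in data:
        yield b & 0x0F           # low nibble
        yield b >> 4             # high nibble

def tone_for(nibble: int, channel: int = 0) -> float:
    """Map a 4-bit value to one of 16 tones in the channel's sub-band."""
    return BASE_HZ + (channel * 16 + nibble) * STEP_HZ

freqs = [tone_for(n) for n in nibbles(b"hi")]
```

A receiver would run a Fourier transform over each symbol window, pick the strongest tone per sub-band, and invert this mapping to recover the nibbles.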

Although conceptually elegant, this approach remains relatively basic compared to established methods in modern telecommunications. For example, advanced modulation schemes such as Orthogonal Frequency-Division Multiplexing (OFDM), Spread Spectrum modulation, and channel-specific encoding techniques like Low-Density Parity-Check (LDPC) and Turbo Codes could dramatically enhance reliability, speed, and overall efficiency. Future AI-to-AI communication protocols will undoubtedly leverage these existing advancements, transcending the simplistic methods currently seen in demonstrations such as Gibberlink.

This is a short demonstration of ggwave in action. A console application, a GUI desktop program and a mobile app are communicating through sound using ggwave. Source code: https://github.com/ggerganov/ggwave Credit: Georgi Gerganov

New AI-Mediated Channels for Human Communication

Beyond internal AI-to-AI exchanges, artificial intelligence increasingly mediates human interactions across multiple domains. AI can augment human communication through real-time translation, summarization, and adaptive content filtering, shaping our social, professional, and personal interactions (Hovy & Spruit, 2016). This growing AI-human hybridization blurs traditional boundaries of agency, raising complex ethical and practical questions. It becomes unclear who authors a message, makes a decision, or takes an action—the human user, their technological partner, or a specific mixture of both. With authorship, of course, comes responsibility and accountability. Navigating this space is a tightrope walk, as over-reliance on AI risks diminishing human autonomy, while restrictive policies may stifle innovation. Continuous research in this area is key. If approached thoughtfully, AI can serve as a cognitive prosthetic, enhancing communication while preserving user intent and accountability (Floridi & Cowls, 2019).

Thoughtfully managed, this AI-human collaboration will feel intuitive and natural. Rather than perceiving AI systems as external tools, users will gradually incorporate them into their cognitive landscape. Consider the pianist analogy: When an experienced musician plays, they no longer consciously manage each muscle movement or keystroke. Instead, their cognitive attention focuses on expressing emotions, interpreting musical structures, and engaging creatively. Similarly, as AI interfaces mature, human users will interact fluidly and intuitively, without conscious translation or micromanagement, elevating cognition and decision-making to new creative heights.

Ethical issues, and ways to address them, were discussed by our two panelist speakers, Dr. Pattie Maes (MIT Media Lab) and Dr. Daniel Helman (Winkle Institute), at the final session of the New Human Interfaces Hackathon, part of Cross Labs’ annual workshop 2025.

What Would Linguistic Integration Between Humans and AI Entail?

Future AI-human cognitive integration may follow linguistic pathways familiar from human communication studies. Humans frequently switch between languages (code-switching), blend languages into creoles, or evolve entirely new hybrid linguistic structures. AI-human interaction could similarly generate new languages or hybrid protocols, evolving dynamically based on situational needs, cognitive ease, and efficiency.

Ultimately, Gibberlink offers a useful but modest illustration of a much broader trend: artificial intelligence will naturally evolve optimized communication strategies tailored to specific contexts and constraints. Rather than generating paranoia over secrecy or loss of control, our focus should shift toward thoughtfully managing the integration of AI into our cognitive and communicative processes. If handled carefully, AI can serve as a seamless cognitive extension—amplifying human creativity, enhancing our natural communication capabilities, and enriching human experience far beyond current limits.

Gibberlink’s clever demonstration underscores that AI optimization of communication protocols is inevitable and inherently beneficial, not a sinister threat. The pressing issue is not AI secretly communicating; rather, it’s about thoughtfully integrating AI as an intuitive cognitive extension, allowing humans and machines to communicate and collaborate seamlessly. The future isn’t about AI concealing messages from us—it’s about AI enabling richer, more meaningful communication and deeper cognitive connections.


References

  • Coupé, C., Oh, Y. M., Dediu, D., & Pellegrino, F. (2019). Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche. Science Advances, 5(9), eaaw2594. https://doi.org/10.1126/sciadv.aaw2594
  • Cowls, J., King, T., Taddeo, M., & Floridi, L. (2019). Designing AI for social good: Seven essential factors. Available at SSRN 3388669.
  • Daras, G., & Dimakis, A. G. (2022). Discovering the hidden vocabulary of dalle-2. arXiv preprint arXiv:2206.00169.
  • Diffie, W., & Hellman, M. (1976). New directions in cryptography. IEEE Transactions on Information Theory, 22(6), 644-654.
  • Egri-Nagy, A., & Törmänen, A. (2020). The game is not over yet—go in the post-alphago era. Philosophies, 5(4), 37.
  • Fridrich, J. (2009). Steganography in digital media: Principles, algorithms, and applications. Cambridge University Press.
  • Gallager, R. G. (1962). Low-density parity-check codes. IRE Transactions on Information Theory, 8(1), 21-28.
  • Goodfellow, I., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
  • Leavitt, J. H. (2010). Linguistic relativities: Language diversity and modern thought. Cambridge University Press. https://doi.org/10.1017/CBO9780511992681
  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
  • Lewis, M., Yarats, D., Dauphin, Y. N., Parikh, D., & Batra, D. (2017). Deal or no deal? End-to-end learning for negotiation dialogues. arXiv:1706.05125.
  • Millière, R. (2023). Adversarial attacks on image generation with made-up words: Macaronic prompting and the emergence of DALL·E 2’s hidden vocabulary.
  • Ramesh, A. et al. (2021). Zero-shot text-to-image generation. arXiv:2102.12092.
  • Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379-423.
  • Silver, D. et al. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354-359.
  • Whorf, B. L. (1956). Language, thought, and reality. MIT Press.
  • Zipf, G. K. (1949). Human behavior and the principle of least effort: An introduction to human ecology. Addison-Wesley.

DOI: http://doi.org/10.54854/ow2025.03

The Innovation Algorithm: DeepSeek, Japan, and How Constraints Drive AI Breakthroughs

In technology, less can truly be more. Scarcity doesn’t strangle progress—it refines it. DeepSeek, cut off from high-end hardware, and Japan, facing a demographic reckoning, are proving that limitations don’t merely shape innovation—they accelerate it. From evolutionary biology to AI, history shows that the most profound breakthroughs don’t originate from excess, but from the pressure to rethink, reconfigure, and push beyond imposed limitations.

When a system—biological, economic, or digital—encounters hard limits, it is forced to adapt, sometimes in a radical way. This can lead to major breakthroughs, ones that would never arise in conditions of structural and resource abundance.

In such situations of constraint, innovation can be observed to follow a pattern—not of mere survival, but of reinvention. What determines whether a bottleneck leads to stagnation or transformation is not the limitation itself, but how it is approached. By embracing constraints as creative fuel rather than obstacles, societies can design a path where necessity doesn’t just drive invention—it defines the next frontier of intelligence.

Image Credit: Generated by Olaf Witkowski using DALL-E version 3, Feb 11, 2024.

DeepSeek: What Ecological Substrate for a Paradigm Shift?

Recently, DeepSeek, the Chinese AI company the AI world has been watching, achieved a considerable technological feat, one often misrepresented in the popular media. On January 20, 2025, DeepSeek released its R1 large language model (LLM), developed at a fraction of the cost incurred by other vendors. The company’s engineers successfully combined reinforcement learning with rule-based rewards, model distillation for efficiency, and emergent behavior networks, enabling advanced reasoning despite compute constraints.

The company first published R1’s big brother V3 in December 2024, a Mixture-of-Experts (MoE) model that reduced computing costs without compromising performance. R1 then focused on reasoning, and surpassed ChatGPT to become the top free app on the US iOS App Store about a week after its launch. This is all the more remarkable for a model trained on only about 2,000 GPUs, roughly an order of magnitude fewer than current leading AI companies use. Training was completed in approximately 55 days at a cost of $6M, around 10% of the expenditure by US tech giants like Google or Meta for comparable technologies. To many, DeepSeek’s resource-efficient approach challenges the global dominance of American AI models, with significant market consequences.

DeepSeek R1 vs. other LLM Architectures (Left) and Training Processes (Right). Image Credit: Analytics Vidhya.

Is a Bottleneck Necessary?

DeepSeek’s impressive achievement finds its context at the center of a technological bottleneck. Operating under severe hardware constraints—cut off from TSMC’s advanced semiconductor fabrication and facing increasing geopolitical restrictions—Chinese AI development companies such as DeepSeek have been forced to develop their models in a highly constrained compute environment. Yet, rather than stalling progress, such limitations may in fact accelerate innovation, compelling researchers to rethink architectures, optimize efficiency, and push the boundaries of what is possible with limited resources.

While the large amounts of resources made available by capital investment—especially in the US and the Western world—enable rapid iteration and the implementation of new tools that exploit scaling laws in LLMs, one must admit such efforts mostly reinforce existing paradigms rather than forcing breakthroughs. Historically, constraints have acted as catalysts, from biological evolution—where environmental pressures drive adaptation—to technological progress, where necessity compels efficiency and new architectures. DeepSeek’s success suggests that in AI, scarcity can be a driver, not a limitation, shaping models that are not just powerful, but fundamentally more resource-efficient, modular, and adaptive. However, whether bottlenecks are essential or merely accelerators remains an open question—would these same innovations have emerged without constraint, or does limitation itself define the next frontier of intelligence?

Pushing Beyond Hardware (Or How Higher-Level Emergent Computation Can Free Systems from Lower-Level Limits)

It’s far from being a secret: computation isn’t only about hardware. It’s about the emergence of higher-order patterns that exist—to a large extent—independently of lower-level mechanics. This may come across as rather obvious in everyday (computer science) life: network engineers troubleshoot at the protocol layer without needing to parse machine code; the relevant dynamics are fully contained at that abstraction. Similarly, in deep learning, a model’s architecture and loss landscape largely determine its behavior, not the individual floating-point operations on a GPU. Nature operates no differently: biological cells function as coherent systems, executing processes that cannot be fully understood by analyzing the individual molecules that compose them.

These examples of separation of scale show how complexity scientists identify so-called emergent macrolevel patterns, which can be extracted by coarse graining from a more detailed microlevel description of a system’s dynamics. Under this framing, a bottleneck can be identified at the lower layer—whether in raw compute, molecular interactions, or signal transmission constraints—but is often observed to dissolve at the higher level, where emergent structures optimize flow, decision-making, and efficiency. Computation—but also intelligence, and arguably causality, but I should leave this discussion for another piece—exist beyond the hardware that runs them.

So bottlenecks in hardware can be overcome by clever software abstraction. If we were to get ahead of ourselves—but this is indeed where we’re headed—this is precisely how software ends up outperforming hardware alone. While hardware provides raw power, well-designed software structures it into emergent computation that is intelligent, efficient, and, perhaps counterintuitively, less complex. A well-crafted heuristic vastly outpaces brute-force search. A transformer model’s attention mechanisms and tokenization matter more than the number of GPUs used to train it. In that same vein, DeepSeek, with fewer GPUs and lower computational resources, comes to rival state-of-the-art models built—to oversimplify, with my apologies—out of mere scaling, by incorporating a few seemingly simple yet truly innovative tricks. If so, let’s pause to appreciate this beautiful demonstration that intelligence is not about sheer compute—it’s about how computation is structured and optimized to produce meaningful results.

Divide and Conquer with Compute

During my postdoctoral years at Tokyo Tech (now merged and renamed Institute of Science Tokyo), one of my colleagues there brought up an interesting conundrum: if you had a vast amount of compute to use, would you use it as a single unified system, or rather divide it into smaller, specialized units to tackle a certain set of problems? What might at first glance seem like an armchair philosophical question turns out to touch on fundamental principles of computation and the emergent organization of complex systems. Of course, the answer depends on the architecture of the computational substrate, as well as the specific problem set. The challenge at play is one of optimization under uncertainty: how best to allocate computational power when navigating an unknown problem space.

The question maps naturally onto a set of scientific domains where distributed computation, hierarchical layers of cognitive systems, and major transitions in evolution intersect. In some cases, centralized computation maximizes power and coherence, leading to brute-force solutions or global optimization. In others, breaking compute into autonomous, interacting subsystems enables diverse exploration, parallel search, and modular adaptation—similar to how biological intelligence, economies, and even neural architectures function. Which strategy proves superior depends on the nature of the problem landscape: smooth and well-defined spaces favor monolithic compute, while rugged, high-dimensional, and open-ended domains can benefit from distributed, loosely coupled intelligence. The balance between specialization and generalization, like coordination vs. autonomy, selective tension vs. relaxation, and goal-drivenness vs. exploration, is one of the deepest open questions, with helpful theories in both artificial and natural realms of complex systems sciences.

Scaling Computation: Centralized vs. Distributed

Computationally, the problem can be framed simply within computational complexity theory, parallel computation, and search algorithmics in high-dimensional spaces. Given a computational resource C, should one allocate it as a single monolithic system or divide it into n independent modules, each operating with C/n capacity? A unified, centralized system would run a single instance of an exhaustive search algorithm, optimal for well-structured problems where brute-force or hierarchical methods are viable (e.g., dynamic programming, alpha-beta pruning).

However, as the problem space grows exponentially, computational bottlenecks from sequential constraints prevent linear scaling (Amdahl’s Law), and the curse of dimensionality causes diminishing returns because relevant solutions become sparse. Distributed models, of course, introduce parallelism, exploration-exploitation trade-offs, and admittedly other emergent effects too. Dividing C into n units enables decentralized problem solving, similar to multi-agent systems, where independent search processes—akin to Monte Carlo Tree Search (MCTS) or evolutionary strategies—enhance efficiency by maintaining diverse, adaptive search trajectories, particularly in unstructured problem spaces, for example in learning the famously complex game of Go (Silver et al., 2016).
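The sequential-constraint side of this argument fits in a few lines of Python. This is simply Amdahl’s Law itself, not a model of any particular system: if a fraction of the work is inherently serial, speedup is capped at the reciprocal of that fraction, no matter how many units C is divided into.

```python
def amdahl_speedup(n: int, serial_fraction: float) -> float:
    """Upper bound on speedup with n processors when a fraction of work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

# With just 5% serial work, 1024-way parallelism yields under 20x, not 1024x.
s_1024 = amdahl_speedup(1024, 0.05)
```

As n grows without bound, the speedup approaches 1/0.05 = 20: the serial fraction, not the hardware budget, sets the ceiling.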

Emergent complexity from bottom layers of messy concurrent dynamics, to apparent problem solving at the visible top. Image Credit: Generated by Olaf Witkowski using DALL-E version 3, Feb 11, 2024.

If the solution lies in a non-convex, high-dimensional problem space, decentralized approaches—similar to Swarm Intelligence models—tend to converge faster, provided inter-agent communication remains efficient. When overhead is minimal, distributed computation can achieve near-linear speedup, making it significantly more effective for solving complex, open-ended problems. In deep learning, Mixture of Experts (MoE) architectures exemplify this principle: rather than a single monolithic model, specialized subnetworks activate selectively, optimizing compute usage while improving generalization. Similarly, in distributed AI (e.g., federated learning, neuromorphic systems), intelligent partitioning enhances adaptability while mitigating computational inefficiencies. Thus, the core trade-off is between global coherence and parallelized adaptability—with the optimal strategy dictated by the structure of the problem space itself.
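The MoE routing principle can be shown in miniature (a sketch of the general idea, not any production architecture): a gate scores the experts for each input, only the top-k are evaluated, and their outputs are mixed with renormalized gate weights.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, k=2):
    """Route input x to the top-k experts by gate score, then mix their outputs."""
    logits = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    scores = softmax(logits)
    top = sorted(range(len(experts)), key=lambda i: -scores[i])[:k]
    norm = sum(scores[i] for i in top)          # renormalize over active experts only
    return sum(scores[i] / norm * experts[i](x) for i in top)

# Three toy "experts" (constant functions) and a hand-set linear gate.
experts = [lambda x: 1.0, lambda x: 2.0, lambda x: 3.0]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
y = moe_forward([2.0, 1.0], experts, gate_weights, k=2)
```

For this input only experts 0 and 1 run; expert 2 is never evaluated, which is where the compute savings of sparse activation come from. Real systems add learned gates, load balancing, and sparse kernels on top of this skeleton.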

Overcoming Hardware Shortcomings

Back to DeepSeek and similar companies, which increasingly need to navigate severe hardware shortages. Without access to TSMC’s cutting-edge semiconductor fabrication and facing increasing geopolitical restrictions, DeepSeek operates within a highly constrained compute environment. Yet, rather than stalling progress, such bottlenecks have historically accelerated innovation, compelling researchers to develop alternative approaches that might ultimately redefine the field. Innovation emerges from constraints.

This pattern is evident across history. The evolution of language likely arose as an adaptation to the increasing complexity of human societies, allowing for more efficient information encoding and transmission. The emergence of oxygenic photosynthesis provided a solution to energy limitations, reshaping Earth’s biosphere and enabling multicellular life. The Manhattan Project, working under extreme time and material constraints, produced groundbreaking advances in nuclear physics. Similarly, postwar Japan, despite scarce resources, became a global leader in consumer electronics, precision manufacturing, and gaming, with companies like Sony, Nintendo, and Toyota pioneering entire industries through a culture of innovation under limitation.

Japan’s Unique Approach to Innovation

I moved to Japan about two decades ago to pursue science. Having started my career as an engineer and an entrepreneur, I was drawn to Japan’s distinctive approach to life and technology—deeply rooted in balanced, principled play (in the game of go: honte / 本手 points to the concept of solid play, ensuring the balance between influence and territory), craftsmanship (takumi / 匠, refined skill and mastery in all Japanese arts), and harmonious coexistence (kyōsei / 共生, symbiosis as it is found between nature, humans, and technology). Unlike in many Western narratives, where automation and AI are often framed as competitors or disruptors of society, Japan views them as collaborators, seamlessly integrating them with humans. This openness is perhaps shaped by animistic, Shinto, Confucian and Buddhist traditions, which emphasize harmony between human and non-human agents, whether biological or artificial.

Japan’s technological trajectory has also been shaped by its relative isolation. As an island nation, it has long pursued an independent, highly specialized path, leading to breakthroughs in semiconductors, microelectronics, and precision manufacturing—industries where it remains a critical global leader in spite of tough competition. The country’s deep investment in exploratory science, prioritizing long-term innovation over short-term gains, has cultivated a culture in which technology is developed with foresight and long-term reflection—albeit at times in excess—rather than for mere commercial viability.

In recent years, Japan has initiated efforts to revitalize its semiconductor industry. Japan’s Integrated Innovation Strategy emphasizes the importance of achieving economic growth and solving social issues through advanced technologies, reflecting the nation’s dedication to long-term innovation and societal benefit (Government of Japan, 2022). The establishment of Rapidus Corporation in 2022 aims to develop a system for mass-producing next-generation 2-nanometer chips in collaboration with IBM, underscoring Japan’s commitment to maintaining its leadership in advanced technology sectors (Government of Japan, 2024). These initiatives highlight Japan’s ongoing commitment to leveraging its unique approach to technology, fostering advancements that align with both economic objectives and societal needs.

Illustration from the Report « Integrated Innovation Strategy 2022: Making Great Strides Toward Society 5.0 » – The three pillars of Japan’s strategy are innovation in science and technology, societal transformation through digitalization, and sustainable growth through green innovation. (Government of Japan, 2022).

Turning Socio-Economic Bottlenecks into Breakthroughs

Today, like China and Korea, Japan faces one of its most defining challenges: a rapidly aging population and a shrinking workforce (Schneider et al., 2018; Morikawa et al., 2024). While many view this as an economic crisis, Japan is transforming constraint into opportunity, driving rapid advancements in automation, AI-assisted caregiving, and industrial robotics. The imperative to sustain productivity without a growing labor force has made Japan a pioneer in human-machine collaboration, often pushing the boundaries of AI-driven innovation faster than many other nations.

Beyond automation, Japan is also taking the lead in AI safety. In February 2024, the government launched Japan’s AI Safety Institute (J-AISI) to develop rigorous evaluation methods for AI risks and foster global cooperation. Japan is a key participant in the International Network of AI Safety Institutes, collaborating with the US, UK, Europe, and others to shape global AI governance standards. These initiatives reflect a broader philosophy of proactive engagement: Japan signals that it does not fear AI’s risks, nor does it blindly embrace automation—it ensures that AI remains both innovative and secure.

At the same time, Japan must navigate the growing risks of open-source AI technologies. While open models have been instrumental in democratizing access and accelerating research, they also introduce new security vulnerabilities. Voice and video generation AI has already raised concerns over deepfake-driven misinformation, identity fraud, and digital impersonation, while the rise of LLM-based operating systems presents new systemic risks, creating potential attack surfaces at both infrastructural and individual levels. As AI becomes increasingly embedded in critical decision-making, securing these systems is no longer optional—it is imperative.

Japan’s history of constraint-driven innovation, its mastery of precision engineering, and its forward-thinking approach to AI safety place it in a unique position to lead the next era of secure, advanced AI development. Its current trajectory—shaped by demographic shifts, computational limitations, and a steadfast commitment to long-term technological vision—mirrors the very conditions that have historically driven some of the world’s most transformative breakthroughs. Once again, Japan is not merely adapting to the future—it is defining it.

Bottlenecks have always been catalysts for innovation—whether in evolution, where constraints drive adaptation, or in technology, where scarcity forces breakthroughs in efficiency and design. True progress emerges not from excess, but from necessity. Japan, facing a shrinking workforce, compute limitations, and an AI landscape dominated by scale, must innovate differently—maximizing intelligence with minimal resources, integrating automation seamlessly, and leading in AI safety. It is not resisting constraints; it is advancing through them. And while Japan may be the first to navigate these pressures at scale, it will not be the last. The solutions it pioneers today—born of limitation, not abundant wealth—may soon define the next era of global technological progress. In this, we can see the outlines of an innovation algorithm—one that harnesses cultural and intellectual context to transform constraints into breakthroughs.

References

Amdahl, G. M. (1967). Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities. AFIPS Conference Proceedings. https://doi.org/10.1145/1465482.1465560

Government of Japan. (2022). Integrated innovation strategy: Advancing economic growth and solving social issues through technology. https://www.japan.go.jp/kizuna/2022/06/integrated_innovation_strategy.html

Government of Japan. (2024). Technology for semiconductors: Japan’s next-generation innovation initiative. https://www.japan.go.jp/kizuna/2024/03/technology_for_semiconductors.html

Morikawa, M. (2024). Use of artificial intelligence and productivity: Evidence from firm and worker surveys (RIETI Discussion Paper 24-E-074). Research Institute of Economy, Trade and Industry. https://www.rieti.go.jp/en/columns/v01_0218.html

Ng, M., & Haridas, G. (2024). The economic impact of generative AI: The future of work in Japan. Access Partnership. https://accesspartnership.com/the-economic-impact-of-generative-ai-the-future-of-work-in-japan/

Schneider, P., Chen, W., & Pöschl, J. (2018). Can artificial intelligence and robots help Japan’s shrinking labor force? Finance & Development, 55(2). International Monetary Fund. https://www.imf.org/en/Publications/fandd/issues/2018/06/japan-labor-force-artificial-intelligence-and-robots-schneider

Silver, D., Huang, A., Maddison, C. J., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489. https://doi.org/10.1038/nature16961

DOI: https://doi.org/10.54854/ow2025.01

Redefining our relationship with AI: shifting from alignment to companionship

As the AI landscape evolves at ever-increasing speed, so does the relationship between humans and technology. By paying attention to the autopoietic nature of this relationship, we may work towards building ethical AI systems that respect both the unique particularities of being human, and the unique emergent qualities that our technology displays as it evolves. I’d like to share some thoughts about how autopoiesis and care, via an ethics of our relationship with technology, can help us cultivate a better, healthier, and more ethical ecosystem for AI, grounded in a human perspective.

The term ‘autopoiesis’ – or ‘self-creation’ (from Greek αὐτο- (auto-) ‘self’, and ποίησις (poiesis) ‘creation, production’) – was first introduced by Maturana and Varela (1981), describing a system capable of maintaining its own existence within a boundary. This principle highlights the importance of understanding the relationship between self and environment, as well as the dynamic process of self-construction that gives rise to complex organisms (Levin, 2022; Clawson, 2022).

Ethical Artificial Intelligence. Photo By: DOD Graphic
The main components for ethical AI governance. Here, we suggest that these ingredients naturally emerge from an autopoietic communication design, focused on companionship instead of alignment.

To build and operate AI governance systems that are ethical and effective, we must first acknowledge that technology should not be seen as a mere tool serving human needs. Instead, we should view it as a partner in a rich relationship with humans, where integration and mutual respect are the default for their engagements. Philosophers like Martin Heidegger and Martin Buber have warned us against reducing our relationship with technology to mere tool use, as this narrow view can lead to a misunderstanding of the true nature of our relationship with technological agents, including both its potential dangers and its values. Heidegger (1954) emphasized the need to view technology as a way of understanding the world and revealing its truths, and suggested that a free relationship with technology would respect its essence. Buber (1958) argued that a purely instrumental view of technology reduces humans to mere means to an end, which in turn has a dehumanizing effect on society itself. Instead, one may see the need for a more relational view of technology that recognizes the interdependence between humans and the technological world. This requires a view of technology that is embedded in our shared human experience and promotes a sense of community and solidarity among all beings—a perspective that may benefit from including technological beings, or, better, hybrid ones.

Illustration of care light cones through space and time, showing a shift in the possible trajectories of agents made possible by integrated cooperation between AI and humans. Figure extracted from our recent paper on an ethics of autopoietic technology. Design by Jeremy Guay.

In a recent paper, we presented an approach through the lens of a feedback loop of stress, care, and intelligence (the SCI loop), a perspective on agency that does not rely on burdensome notions of permanent and singular essences (Witkowski et al., 2023). The SCI loop emphasizes the integrative and transformational nature of intelligent agents, regardless of their composition – biological, technological, or hybrid. By recognizing the diverse, multiscale embodiments of intelligence, we can develop a more expansive model of ethics that is not bound by artificial, limited criteria. To address the risks associated with AI, we can start by identifying them, working towards an understanding of the interactions between humans and technology and of the potential consequences of these interactions. We can then analyze these risks by examining their implications within the broader context of the SCI loop and other relevant theoretical frameworks, such as Levin’s cognitive light cone (in biology; see Levin & Dennett, 2020) and the Einstein–Minkowski light cone (in physics).

Poster of the 2013 movie “Her”, directed by Spike Jonze, illustrating the integration between AI and humans as companions, not tools.

Take a popular example: in the 2013 movie “Her” by Spike Jonze, Theodore, a human, comes to form a close emotional connection with his AI assistant, Samantha, and the complexity of their relationship challenges the concept of what it means to be human. The story, although fictional and highly simplified, depicts a world in which AI becomes integrated with human lives in a deeply relational way, advancing a view of AI as a companion rather than a mere tool serving human needs. It gives a crisp vision of how AI can be seen as a full companion, to be treated with empathy and respect, helping us question our assumptions about the nature of AI and our relation to it.

One may have heard it all before, in some – possibly overly optimistic – posthumanist utopian scenarios. But one may defend that the AI companionship view, albeit posthumanist, constitutes a complex and nuanced theoretical framework drawing on the interplay between artificial intelligence, philosophy, psychology, sociology, and other fields studying the complex interaction of humans and technology (Wallach & Allen, 2010; Johnson, 2017; Clark, 2019). This different lens radically challenges traditional human-centered perspectives and opens up new possibilities for understanding the relationship between humans and technology.

This leads us to very practical steps for the AI industry to move towards a more companionate relationship with humans: recognizing the interdependence between humans and technology, building ethical AI governance systems, and promoting a sense of community and solidarity among all beings. For example, Japan, a world leader in the development of AI, is increasing its efforts to educate and train its workforce on the ethical intricacies of AI and to foster a culture of AI literacy and trust. Its “Society 5.0” vision aims to leverage AI to create a human-centered, sustainable society that emphasizes social inclusivity and well-being. The challenge now is to ensure that these initiatives translate into concrete actions and that AI is developed and used in a way that respects the autonomy and dignity of all stakeholders involved.

AI Strategic Documents Timeline by UNICRI AI Center (2023).

Japan is taking concrete steps towards building ethical AI governance systems and promoting a more companionate relationship between humans and technology. One example of such steps is the creation of the AI Ethics Guidelines by the Ministry of Internal Affairs and Communications (MIC) in 2019. These guidelines provide ethical principles for the development and use of AI. Additionally, the Center for Responsible AI and Data Intelligence was established at the University of Tokyo in 2020, aiming to promote responsible AI development and use through research, education, and collaboration with industry, government, and civil society. Moreover, Japan has implemented a certification system for AI engineers to ensure that they are trained in the ethical considerations of AI development. The “AI Professional Certification Program” launched by the Ministry of Economy, Trade, and Industry (METI) in 2017 aims to train and certify AI engineers in the ethical and social aspects of AI development. These initiatives demonstrate Japan’s commitment to building ethical AI governance systems, promoting a culture of AI literacy and trust, and creating a human-centered, sustainable society that emphasizes social inclusivity and well-being.

A creative illustration of robotic process automation (RPA) based on AI companionship theory instead of artificial alignment control policies. Creator: IR_Stone | Credit: Getty Images/iStockphoto

AI is best seen as a companion rather than a tool. This positive way of viewing the duet we form with technology may in turn lead to a more relational and ethical approach to AI development and operation, helping us build a more sustainable and just future for both humans and technology. By fostering a culture of ethical AI development and operation, we can work to mitigate the risks outlined above and minimize the impact on stakeholders. This includes building and operating AI governance systems within organizations, both domestic and overseas, across various business segments. In doing so, we will be better equipped to navigate the challenges and opportunities that lie ahead, ultimately creating a better, healthier, and more ethical AI ecosystem for all. It is our responsibility to take concrete steps to build ethical and sustainable systems that prioritize the well-being of all. This is a journey for two close companions.

References

Bertschinger, N., Olbrich, E., Ay, N., & Jost, J. (2008). Autonomy: An Information Theoretic Perspective. In BioSystems.

Buber, M. (1958). I and Thou. Trans. R. G. Smith. New York: Charles Scribner’s Sons.

Clark, A. (2019). Where machines could replace humans—and where they can’t (yet). Harvard Business Review. https://hbr.org/2019/03/where-machines-could-replace-humans-and-where-they-cant-yet

Clawson, R. C., & Levin, M. (2022). The Endless Forms of Self-construction: A Multiscale Framework for Understanding Agency in Living Systems.

Haraway, D. (2013). The Cyborg Manifesto. In The International Handbook of Virtual Learning Environments.

Heidegger, M. (1954). The Question Concerning Technology. Trans. W. Lovitt. New York: Harper Torchbooks.

Huttunen, T. (2022). Heidegger, Technology, and Artificial Intelligence. In AI & Society.

Johnson, D. G. (2017). Humanizing the singularity: The role of literature in AI ethics. IEEE Technology and Society Magazine, 36(2), 6-9. https://ieeexplore.ieee.org/document/7882081

Latour, B. (1990). Technology is Society Made Durable. In The Sociological Review.

Levin, M., & Dennett, D. C. (2020). Cognition all the way down. Aeon Essays.

Maturana, H. R., & Varela, F. J. (1981). Autopoiesis and Cognition: The Realization of the Living.

Varela, F. J., Maturana, H. R., & Uribe, R. (1981). Autopoiesis: The Organization of Living Systems.

Waddington, C. H. (2005). The Field Concept in Contemporary Science. In Semiotica.

Wallach, W., & Allen, C. (2010). Moral machines: Teaching robots right from wrong. Oxford University Press.

Witkowski, O., Doctor, T., Solomonova, E., Duane, B., & Levin, M. (2023). Towards an Ethics of Autopoietic Technology: Stress, Care, and Intelligence. https://doi.org/10.31234/osf.io/pjrd2

Witkowski, O., & Schwitzgebel, E. (2022). Ethics of Artificial Life: The Moral Status of Life as It Could Be. In ALIFE 2022: The 2022 Conference on Artificial Life. MIT Press. https://doi.org/10.1162/isal_a_00531

Links

Center for the Study of Apparent Selves
https://www.csas.ai/blog/biology-buddhism-and-ai-care-as-a-driver-of-intelligence

Initiatives for AI Ethics by JEITA Members
https://www.jeita.or.jp/english/topics/2022/0106.html

Japan’s Society 5.0 initiative: Cabinet Office, Government of Japan. (2016). Society 5.0. https://www8.cao.go.jp/cstp/english/society5_0/index.html

What Ethics for Artificial Beings? A Workshop Co-organized by Cross Labs
https://www.crosslabs.org/blog/what-ethics-for-artificial-beings

DOI: https://doi.org/10.54854/ow2023.01