Tag Archives: ai

Interspecies Communication—Fantasy or Reality?

Steven Shepard

If you would like to listen to this as an audio essay, complete with the calls of marine mammals, please go here.

In the inaugural issue of National Geographic in 1888, Gardiner Hubbard wrote, “When we embark on the great ocean of discovery, the horizon of the unknown advances with us and surrounds us wherever we go. The more we know, the greater we find is our ignorance.” Gardiner was the founder and first president of the National Geographic Society, the first president of AT&T, and a founder of the prestigious journal, Science. He knew what he was talking about. 

104 years later, NASA scientist Chris McKay had this to say: “If some alien called me up and said, ‘Hello, this is Alpha, and we’d like to know what kind of life you have,’ I’d say, water-based. Earth organisms figure out how to make do without almost anything else. The single non-negotiable thing life requires is water.”

Several years ago I met Alaska-based sound recordist and author Hank Lentfer. During one of several conversations, he asked me to imagine that all the knowledge we each have is ‘inside of this circle,’ which he drew in the air. “Everything outside the circle is what we don’t know,” he told me. “As we learn, the circle gets bigger, because more knowledge is added to it. But notice that as the circle grows, so does its circumference, which touches more of what we don’t know.” In other words, the more we know, the more we don’t. As Hubbard said in 1888, “The more we know, the greater we find is our ignorance.”

I love science, and one of the things about it that I love the most is how it constantly refreshes itself in terms of what’s new and fresh and exciting and worthy of exploration as a way to add to what’s inside Hank’s circle. Think about it: Over the last twenty years, scientists have mapped the entire human genome; developed CRISPR/Cas9 to do gene editing; created synthetic cells and DNA; discovered the Higgs boson, gravitational waves, and water on Mars; developed cures or near-cures for HIV, some forms of cancer, and hepatitis C; and created functional AI and reusable rockets. And those are just the things I chose to include.

One of the themes in my new novel, “The Sound of Life,” is interspecies communication—not in a Doctor Dolittle kind of way—that’s silly—but in a more fundamental way, using protocols that involve far more listening on our part than speaking.

The ability to communicate with other species has long been a dream among scientists, which is why I’m beyond excited by the fact that we are close to engaging in a form of two-way communication with other species. So, I want to tell you a bit about where we are, how we got here, and why 2026 is widely believed to be the year we make contact, to steal a line from Arthur C. Clarke.

Some listeners may remember when chimps, bonobos and gorillas were taught American Sign Language with varying degrees of success. Koko the gorilla, for example, who was featured on the cover of National Geographic, learned to use several dozen American Sign Language signs, but the degree to which she actually understood what she was saying remains a hotly debated topic, more than 50 years later.

But today is a very different story. Until recently, research in interspecies communication was based on trying to teach human language to non-human species. Current efforts turn that model on its head: researchers are using Artificial Intelligence to meet animals on their own terms—making sense of their natural communications rather than forcing them to use ours. Said another way, it’s time we learned to shut up and listen for a change. And that’s what researchers are doing.

There have been significant breakthroughs in the last few years, many of them the result of widely available AI that can be trained to search for patterns in non-human communications. Now, before I go any further with this, I should go on record. Anybody who’s a regular listener to The Natural Curiosity Project Podcast knows that I don’t take AI with a grain of salt—I take it with a metric ton of it. As a technologist, I believe that AI is being given far more credit than it deserves. I’m not saying it won’t get there—far from it—but I think humans should take a collective breath here. 

I’ve also gone on record many times with the observation that ‘AI’ as an abbreviation has been assigned to the wrong words. Instead of Artificial Intelligence, I think AI should stand for Accelerated Insight, because that’s the deliverable it makes available to us when we use it properly. It makes long, slow, complex, and, let’s face it, boring jobs, usually jobs that involve searching for patterns in a morass of data, enormously faster. Here’s an example that I’ve used many times. A dermatologist who specializes in skin cancers has thousands of photographs of malignant skin lesions. She knows that the various forms of skin cancer can differ in terms of growth rate, shape, color, topology, texture, surface characteristics, and a host of other identifiers. She wants to look at these lesions collectively to find patterns that might link cause to disease. Now she has a choice. She can sit down at her desk with a massive stack of photographs and a note pad, and months later she may have identified repeating patterns. Or she can ask an AI system to do it for her and get the same results in five minutes. It’s all about speed.

That’s a perfect application for AI, because it takes advantage of AI’s ability to quickly and accurately identify patterns hidden within a chaos of data. And that’s why research into interspecies communication today is increasingly turning to AI as a powerful tool—with many promising outcomes, and a few spellbinding surprises. 

Let’s start with the discovery of the “Sperm Whale Phonetic Alphabet.” Project CETI, the Cetacean Translation Initiative, has produced what bioacoustics researchers are calling the “Rosetta Stone” for marine interspecies communication. Here’s what we know. Researchers have identified structural elements in the sounds generated by sperm whales that are similar to human vowels, like a, e, i, o, and u, and diphthongs, like the ‘ow’ in the word sound, the ‘oy’ in noise, and the ‘oo’ in tour. They’ve also identified a characteristic called “rubato,” which is measurable variation in tempo that conveys meaning, and “ornamentation,” the addition of extra clicks. Together, these suggest that sperm whales may have what’s called a combinatorial grammar that could transmit enormous amounts of information. Combinatorial grammar: let me explain. “He had a bad day” is a perfectly acceptable statement. “He had a no-good, horrible, terrible, very bad day” is an example of combinatorial grammar. It adds richness and nuance to whatever’s being said.

This is the first time researchers have ever found a non-human communication system that relies on the same kinds of phonetic building blocks that human speech relies on. This is a very big deal.

So: Using machine learning, scientists have analyzed almost 9,000 codas, which are uniquely identifiable click sequences, and in the process have discovered that sperm whale communication is enormously more complicated and nuanced than we previously believed.

So, how did they do it? Well, in the same way that ChatGPT is trained on huge databases of human text, new models are being trained on the sounds of the natural world. For example, NatureLM-audio is a system that was launched by the Earth Species Project in 2025. It’s the first large audio-language foundation model specifically built for bioacoustics. Not only can it identify unique species, it can also determine the life stage they’re in and the emotional state of the animal when it was recorded—for example, whether the creature was stressed, playing, relaxed, and so on. And it can do this across thousands of species, simultaneously.

Then there’s WhAM, the Whale Acoustics Model. This is a transformer-based model that can generate synthetic, contextually accurate whale codas, which could someday lead to two-way real-time engagement with whales.

I should probably explain what a transformer-based model is, because it’s important. In bioacoustics, a transformer-based model uses a technique called the self-attention mechanism, borrowed from natural language processing, to analyze animal sounds. The self-attention mechanism asks a question: to truly understand the context and meaning of this particular word (or, in this case, sound), what do I need to know about the other words being used by the speaker at the same time? This allows the system to capture long-range patterns in audio spectrograms, which in turn allows for highly accurate species identification, sound event detection (like bird calls or bat echolocation), and other identifiers, especially when the data to be analyzed is limited. Models like the Audio Spectrogram Transformer and custom systems like animal2vec convert captured audio into small segments called patches, then process them to identify patterns.

In bioacoustics—such as studying the meaning and context of whale song, or in the case of animal2vec, the vocalizations of meerkats—the raw audio is converted into visual representations called spectrograms, which display the changing frequency of the recording against elapsed time. These are then broken into smaller patches. Each patch then gets a unique “position” tag so the model knows the order of the sounds in the sequence. This is called Positional Encoding.
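
To make that concrete, here is a minimal sketch in Python of the patching and positional-tagging step, assuming a mono recording loaded as a NumPy array. The patch size is arbitrary, and tagging each patch by simply appending its index is a simplification of the learned or sinusoidal positional encodings real models use.

# Minimal sketch: raw audio -> spectrogram -> fixed-width patches with position tags.
# The patch size is illustrative; appending the index is a stand-in for real
# positional encodings.
import numpy as np
from scipy.signal import spectrogram

def audio_to_patches(audio, sample_rate, patch_frames=16):
    freqs, times, spec = spectrogram(audio, fs=sample_rate, nperseg=256)
    spec = np.log1p(spec)                          # compress dynamic range
    n_patches = spec.shape[1] // patch_frames      # slice the time axis
    patches = np.stack([spec[:, i * patch_frames:(i + 1) * patch_frames].ravel()
                        for i in range(n_patches)])
    positions = np.arange(n_patches, dtype=float)[:, None]
    return np.hstack([patches, positions])         # each row: one position-tagged patch

fake_audio = np.random.randn(48_000)               # one second of stand-in audio at 48 kHz
print(audio_to_patches(fake_audio, 48_000).shape)  # (number of patches, features + 1)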

Next, the system unleashes the Self-Attention Mechanism, which allows the model to weigh the importance of different sound patches relative to each other, creating a better understanding of context and relationships across long audio segments.
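
Here is what that weighting looks like in its barest form, a sketch in NumPy with no learned projections and no multiple heads. Real transformer layers add both, but the core idea is the same: every patch is re-expressed as a mixture of the patches most relevant to it.

# Bare-bones self-attention over a sequence of patch vectors.
import numpy as np

def self_attention(patches):
    d = patches.shape[1]
    scores = patches @ patches.T / np.sqrt(d)           # pairwise relevance
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over each row
    return weights @ patches                            # context-aware patch vectors

print(self_attention(np.random.randn(13, 64)).shape)    # 13 patches in, 13 out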

The next step is Feature Extraction. The model learns deep, complex acoustic features, such as nuanced meaning in bird songs or bat calls, which can be tagged to different species or behaviors.

Finally, the model classifies the sounds, in the process identifying unique species, or detecting specific identifiable events, such as a predator call.
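
Putting the steps together, here is a compact sketch of the whole pipeline in PyTorch: embed the patches, run them through a transformer encoder (the self-attention and feature-extraction steps), then classify. Every size, and the five-class output, is a placeholder; this is not the configuration of the Audio Spectrogram Transformer, animal2vec, or any other published model.

# Compact sketch: patch embeddings -> transformer encoder -> classifier.
# All sizes and the label count are placeholders, not a published configuration.
import torch
import torch.nn as nn

class BioacousticClassifier(nn.Module):
    def __init__(self, patch_dim=2064, d_model=128, n_classes=5):
        super().__init__()
        self.embed = nn.Linear(patch_dim, d_model)            # patch -> embedding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # self-attention stack
        self.classify = nn.Linear(d_model, n_classes)         # species / event logits

    def forward(self, patches):                  # patches: (batch, n_patches, patch_dim)
        features = self.encoder(self.embed(patches))          # feature extraction
        return self.classify(features.mean(dim=1))            # pool over time, classify

model = BioacousticClassifier()
print(model(torch.randn(1, 13, 2064)).shape)     # torch.Size([1, 5])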

The implications of all this are significant. First, contextual understanding is created because the system captures long-range dependencies among the audio patches, which are crucial for understanding complex animal vocalizations. Second, these systems outperform other models, including Convolutional Neural Networks, which have long been at the forefront of machine learning for audio. Third, they work well in what is called a Few-Shot Learning Environment, which is an environment where the amount of labeled data that can be analyzed is limited. And finally, because of the use of the Self-Attention Mechanism, the system creates a high degree of interpretability.

These are tools which have utility well beyond the moonshot project of interspecies communication. They can be used to monitor wildlife populations through sound alone; they can detect and identify bird and bat calls; and they can even be used to identify bee species from the frequency and tonality of the buzz their wings create when they fly by a microphone. Remember— Karl von Frisch won a Nobel Prize for deciphering the complex dance of the honeybee and how that dance conveys complex, species-specific information to other members of the hive.

All of these are important in the world of ecology and habitat monitoring and protection. 

Here’s another fascinating example that has gotten a lot of attention. Recent field studies have shown that elephants and carrion crows have far more sophisticated vocal repertoires than we assumed. In the case of elephants, researchers have used machine learning tools to show that wild African elephants use unique vocal labels to address each other, a form of naming behavior. Unlike dolphins, who mimic other dolphins’ whistles, elephants appear to use arbitrary names for other elephants—a sign of advanced abstract thought.

In the case of crows, researchers using miniature biologgers—essentially tiny microphones and recorders about the size of a pencil eraser that are attached to wild animals—have discovered that carrion crows have a secret “low-volume” vocabulary that they use for intimate family communication, very different from the loud, raucous sounds that are used for territory protection and alarm calls.

Finally, we’re seeing breakthroughs in animal welfare practices in the farming and ranching industries because of bioacoustics. In the poultry business, for example, a “chicken translator” is now in use that can identify specific distress calls, which allows farmers to locate sick or stressed birds among thousands, significantly improving the welfare of the flock.

Before I continue with this discussion, let’s talk about why all this is happening now, in the final days of 2025, and why scientists believe we may be on the verge of a major breakthrough in interspecies communications. It has to do with three factors. 

First, we have Big Data, both as a theory and as a hard practice. The idea that patterns can be found in massive volumes of data has been around for a while, but we’re just now developing reliable tools that can predictably find those patterns and make sense of them. Initiatives like the Earth Species Project are aggregating millions of hours of animal audio into a single database, which can then be analyzed.

Second, we have data aggregation techniques and mechanisms that allow for data to be collected around the clock, regardless of climate or weather. The tiny biologgers I mentioned earlier are examples, as are weatherproof field recorders that can record for weeks on a single memory card and set of batteries.

Finally, we have one of the basic characteristics of AI, which is unsupervised learning—the ability to find patterns in vast stores of data without being told what to look for.

I’m going to add a fourth item to this list, which is growing professional recognition that sound is as good an indicator of ecosystem details as sight. I may not be able to see that whale in the ocean, but I can hear it, which means it’s there.

Okay, let’s move on and talk about the nitty-gritty: how do those sperm whale vowel sounds that I described earlier actually work? And to make sure you know what I’m talking about, here’s what they sound like. This recording comes from Mark Johnson, and it can be found at the “Discovery of Sound in the Sea” Web site.

Amazing, right? Some scientists say it’s the loudest natural sound in the ocean. Anyway, to answer this question about how the sperm whale vowel sounds work,  we have to stop thinking about sound as a “message” and start looking at its internal architecture. Here’s what I mean. For decades, researchers believed that the clicks made by sperm whales, the codas, were like Morse Code: a simple sequence of on/off pulses, kind of like a binary data transmission. However, in 2024 and 2025, Project CETI discovered that the clicks made by sperm whales have a sophisticated internal structure that functions exactly the way vowels do in human speech.

In the same way that human speech is made up of small units of sound called phonemes, whale codas are characterized by four specific “acoustic dimensions.” By analyzing thousands of hours of recorded whale song, researchers using AI determined that whales mix and match these dimensions to create thousands of unique signals. The four dimensions are rhythm, which is the basic pattern of the clicks; tempo, the overall speed of the coda; rubato, which is the subtle stretching or squeezing of time between clicks; and ornamentation, the addition of short “extra” clicks at the end of a sequence, similar to a suffix or punctuation mark.
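
To see why mixing and matching matters, here is a back-of-the-envelope sketch. The number of distinct values given to each dimension is invented purely for illustration, not CETI’s actual inventory, but even modest counts multiply quickly.

# Toy illustration of combinatorial structure: a handful of values per dimension
# multiplies into hundreds of distinct codas. The counts per dimension are invented.
from itertools import product

rhythms = range(18)      # distinct click patterns
tempos = range(5)        # overall speeds
rubatos = range(3)       # stretch/squeeze of inter-click timing
ornaments = range(2)     # extra click present or absent

codas = list(product(rhythms, tempos, rubatos, ornaments))
print(len(codas))        # 18 * 5 * 3 * 2 = 540 distinct combinations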

That discovery was a game-changer, and it really knocked the researchers back on their heels. But the most important discovery, which happened in late 2024, was the identification of formants in whale speech. In human language, formants are the specific resonant frequencies created in the throat and mouth that result in the A, E and O vowel sounds. Well, researchers discovered that whales use their “phonic lips,” which are vocal structures in their nose, to modulate the frequency of their clicks in the same way that humans do with their lips and mouth. For example, the a-vowel is a click with a specific resonant frequency peak. The i-vowel is a click with two distinct frequency peaks. Whales can even “slide” the frequency in the middle of a click to create a rising or falling sound similar to the “oi” in “noise” or the “ou” in “trout.” These are called diphthongs.
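
As a rough illustration of what “one peak versus two peaks” means in practice, here is a sketch that counts prominent peaks in a click’s spectrum and labels it accordingly. The threshold and the synthetic test click are arbitrary; this is not the CETI analysis pipeline.

# Rough sketch: label a click "a-like" (one spectral peak) or "i-like" (two peaks)
# by counting prominent peaks in its spectrum. Threshold and test signal are arbitrary.
import numpy as np
from scipy.signal import find_peaks

def vowel_like(click):
    spectrum = np.abs(np.fft.rfft(click))
    peaks, _ = find_peaks(spectrum, prominence=spectrum.max() * 0.3)
    return {1: "a-like", 2: "i-like"}.get(len(peaks), "other")

t = np.arange(0, 0.01, 1 / 96_000)                       # 10 ms synthetic click
click = np.sin(2 * np.pi * 4_000 * t) + np.sin(2 * np.pi * 9_000 * t)
print(vowel_like(click))                                 # "i-like": two frequency peaks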

So, how does this actually work? It turns out that whale vocalization is based on what linguists call Source-Filter Theory. Compared to human language, the similarities are eerie. In human speech, air passes through the vocal cords to create sound; in whales, it passes through what are called phonic lips. In human speech, variation is accomplished by changing the shape of the mouth and tongue; in sperm whales, it happens using the spermaceti organ and nasal sacs.

In humans, the result is recognizably-unique vowels, like A, E, I, O, U; in whales, the result is a variety of spectral patterns. And in terms of complexity, there isn’t much difference between the two. Humans generate thousands of words; whales generate thousands of codas.

So … the ultimate question. Why do we care? Why does this research matter? Why is it important? Several reasons. 

First, before these most recent discoveries about the complexity of animal communication, scientists believed that animal “language”—and I use that word carefully—was without nuance. In other words, one sound meant one thing, like ‘food’ or ‘danger’ or ‘come to me.’ But the discovery of these so-called “whale vowels” now makes us believe that their language is far more complex and is in fact combinatorial—they aren’t just making a sound; they’re “building” a meaningful signal out of smaller parts, what we would call phonemes. This ability is a prerequisite for true language, because it allows for the creation of an almost infinite variety of meanings from a limited set of sounds.

So: one of the requirements for true communication is the ability to anticipate what the other person is going to say before they say it. This is as true for humans as it is for other species. So, to predict what a whale is going to say next, researchers use a specialized Large Language Model called WhaleLM. It’s the equivalent of ChatGPT for the ocean: In the same way that ChatGPT uses the context of previous words in a conversation to predict what the next word will be in a sentence, WhaleLM predicts the next coda or whale song based on the “conversation history” of the pod of whales to which the individual belongs. Let me explain how it works.

Large Language Models, the AI systems trained on massive databases of text, rely on a process called ‘tokenization.’ A token is a unit of the system—like a word, for example, or in the case of sperm whales, the clicks they make. Since whale clicks sound like a continuous stream of broadband noise to humans, researchers use AI to “tokenize” the whale audio into unique, recognizable pieces. The difference, of course, is that they don’t feed text into the LLM, because text isn’t relevant for whales. Instead, they feed it the acoustic dimensions we talked about earlier: Rhythm, Tempo, Rubato, and Ornamentation.
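
Here is a toy version of that tokenization step, assuming each coda has already been reduced to measurements along the four dimensions. The bin counts are invented; the point is simply that every unique combination gets its own token.

# Toy tokenizer: map a coda's four measured dimensions to a single integer token.
# Bin counts are invented for illustration.
def coda_to_token(rhythm_id, tempo_bin, rubato_bin, has_ornament,
                  n_tempos=5, n_rubatos=3):
    token = rhythm_id                        # mixed-radix encoding:
    token = token * n_tempos + tempo_bin     # each unique combination
    token = token * n_rubatos + rubato_bin   # gets its own id
    token = token * 2 + int(has_ornament)
    return token

print(coda_to_token(rhythm_id=7, tempo_bin=2, rubato_bin=1, has_ornament=True))  # 225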

Next comes the creation of a vocabulary. From analysis of the four acoustic dimensions, the AI identifies specific sound sequences, which are then treated as the vocabulary of the pod that uttered the sounds in the first place.

Next comes the creation of context, or meaning. WhaleLM made a critical discovery in late 2024, which was the identification of what are called long-range dependencies. These dependencies are described in what researchers call the “Eight Coda Rule.” Scientists determined conclusively that a whale’s next call is heavily influenced by the previous eight codas in the conversation, which is typically about 30 seconds or so of conversation time. 
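
WhaleLM itself is a neural model, but the “eight coda” idea can be illustrated with something much simpler: predict the next coda token from the previous eight by counting what followed that exact context in past conversations. This is only an analogy for the context window, not how WhaleLM actually works.

# Simple analogy for the eight-coda context window: next-token prediction by counting.
from collections import Counter, defaultdict

CONTEXT = 8

def train(sequences):
    table = defaultdict(Counter)
    for seq in sequences:
        for i in range(CONTEXT, len(seq)):
            table[tuple(seq[i - CONTEXT:i])][seq[i]] += 1   # context -> next-coda counts
    return table

def predict_next(table, history):
    context = tuple(history[-CONTEXT:])
    return table[context].most_common(1)[0][0] if table[context] else None

conversations = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]]
print(predict_next(train(conversations), [1, 2, 3, 4, 5, 6, 7, 8]))   # 9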

WhaleLM also has the benefit of multi-whale awareness. It doesn’t track the “speech” of a single whale; it tracks and analyzes the sounds uttered by all whales in the pod and the extent to which they take turns vocalizing. If Whale A says “X,” the model can predict with high accuracy whether Whale B will respond with “Y” or “Z.” But here’s a very cool thing that the researchers uncovered: Not only does WhaleLM predict a sound that will soon follow, it also predicts actions that the sounds are going to trigger. For example, researchers identified a specific sequence of codas, called the diving motif, that indicates with extreme accuracy—like 86 percent accuracy—that if uttered by all the whales in an exchange, the pod is about to dive to hunt for food. In other words, these sound sequences aren’t just noise—the equivalent of whales humming, for example—they’re specific instructions shared among themselves with some intended action to follow. I don’t know about you, but I find that pretty mind-blowing.
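
A toy version of the “diving motif” idea might look like the sketch below: flag a likely dive only when every whale in the exchange has produced the motif. The motif tokens here are placeholders, not the actual sequence researchers identified.

# Toy detector: predict a dive only when every whale in the exchange utters the motif.
# The motif tokens are placeholders.
DIVING_MOTIF = (12, 12, 7)

def contains_motif(tokens, motif=DIVING_MOTIF):
    return any(tuple(tokens[i:i + len(motif)]) == motif
               for i in range(len(tokens) - len(motif) + 1))

def pod_about_to_dive(exchange):          # exchange: {whale_id: [coda tokens]}
    return all(contains_motif(tokens) for tokens in exchange.values())

exchange = {"whale_A": [3, 12, 12, 7, 5], "whale_B": [12, 12, 7], "whale_C": [9, 12, 12, 7]}
print(pod_about_to_dive(exchange))        # True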

The natural next step, of course, is to ask how we might use this analytical capability to carry on a rudimentary conversation with a non-human creature. Because researchers can now predict what a “natural” response should be, they can use WhaleLM to design what are called Playback Experiments. Here’s how they work. Researchers play an artificial coda, generated by WhaleLM, to a wild whale to see if the whale responds the way the AI predicts it might. If the whale does respond, it confirms that the researchers have successfully decoded a legitimate whale grammar rule.
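
In code, the logic of a playback trial might be sketched like this. The functions for playing, recording, predicting, and judging distress are hypothetical stand-ins, not a real API; what matters is the loop: predict, play, listen, compare, and abort the moment the animal objects.

# Sketch of playback-experiment logic. All helper functions are hypothetical stand-ins.
def run_playback_trial(candidate_coda, predict_response, play_coda,
                       record_response, whale_shows_distress):
    predicted = predict_response(candidate_coda)   # what the model expects back
    play_coda(candidate_coda)                      # transmit the synthetic coda
    observed = record_response(timeout_s=30)       # listen for a reply
    if whale_shows_distress(observed):
        return "aborted"                           # stop immediately
    if observed is None:
        return "no response"
    return "grammar rule confirmed" if observed == predicted else "prediction missed"

# Trivial stand-ins so the sketch runs end to end.
predict = lambda coda: (4, 4, 1)
play = lambda coda: None
record = lambda timeout_s: (4, 4, 1)
distress = lambda observed: False
print(run_playback_trial((3, 3, 2), predict, play, record, distress))  # grammar rule confirmed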

Let’s be clear, though. We don’t have a “whale glossary of terms” yet that we can use to translate back and forth between human language and whale language. What we have are the rules. We’re still in the early stage of understanding syntax—how the pieces of the signal fit together. We aren’t yet into the semantics phase—what those pieces mean.

In the leadership workshops I used to deliver I would often bring up what I called “The Jurassic Park Protocol.” It simply said, just because you CAN make dinosaurs doesn’t mean you SHOULD. And we know they shouldn’t have, because there are at least six sequels to the original movie and they all end badly.

The same rule applies to interspecies communication. Just because we may have cracked the code on some elements of whale communication doesn’t mean that we should inject ourselves into the conversation. This is heady stuff, and the likelihood of unintended consequences is high. In 2025, researchers from Project CETI and the More-Than-Human Life (MOTH) Program at NYU introduced a formal Ethical Roadmap known as the PEPP Framework. PEPP stands for Prepare, Engage, Prevent, and Protect, and it treats whales as “subjects” with rights rather than “objects” to be studied.

So, PEPP stipulates four inviolable commitments that researchers must meet before they’re allowed to engage in cross-species conversations using AI-generated signals. The first is PREPARE: Before a sound is played back to a whale, researchers must prove they have minimized the potential for risk to the animal by doing so. For example, scientists worry that if they play an AI-generated whale call, they might inadvertently say something that causes panic, disrupts a hunt, or breaks a social bond. Similarly, PEPP requires that researchers use equipment that doesn’t add noise pollution that interferes with the whales’ natural sonar. We’ll talk more about that in a minute.

The next commitment is ENGAGE. To the best of our current knowledge, whales don’t have the ability to give us permission to engage with them, so PEPP requires researchers to look for any kind of identifiable behavioral consent. If the whale demonstrates evasive behavior such as diving, moving away, or issuing a coda rhythm that indicates distress, the experiment must stop immediately. The ultimate goal is to move toward a stage called Reciprocal Dialog, in which the whale has the right and ability to end the conversation at any time.

The third pillar of the PEPP protocol is PREVENT. This is very complicated stuff: researchers must take deliberate steps to ensure that they do not inadvertently become members of the pod. There is concern, for example, that whales might become “addicted” to interacting with the AI, or that it might change how they teach their calves to speak. A related concern is Cultural Preservation. Different whale pods have different “dialects,” and PEPP forbids researchers from playing foreign dialects to groups of whales—for example, playing a recording captured in the Caribbean to a pod of whales in the Pacific Ocean—because it could contaminate their own vocal culture.

The final commitment is PROTECT, and it has less to do with the process of establishing communication and more to do with what occurs after it happens. The PEPP protocol argues that if we prove whales have a language, then we’re ethically and morally obligated to grant them legal rights. And, since AI can now “eavesdrop” on private pod conversations, PEPP establishes data privacy rules for the whales, ensuring their locations aren’t shared with commercial fisheries or whaling interests.

There’s an old joke about what a dog would do if it ever caught the car it was chasing. The same question applies to the domain of interspecies communication. If we are successful, what should we say? Most researchers agree that first contact should not be a casual meet and greet, but should instead be what are called mirroring experiments. One of these is called the Echo Test, in which the AI listens to a whale and repeats back a slightly modified version of the same coda. The intent is not to tell the whale something new, but to see if they recognize that the “voice” in the water is following the rules of their grammar. It’s a way of asking, “Do you hear me?” instead of “How you doin’?”

Researchers have identified three major risks that must be avoided during conversational engagement with whales. The first is the risk of social disruption. To avoid this, only “low-stakes” social codas can be used for playback, never alarm or hunt calls. 

The second risk is human bias. To avoid this outcome, the AI is trained only on wild data to avoid “human-sounding” accents in the whale’s language.

Finally, we have the very real risk of exploitation. To prevent this from happening, the data is open-source but “de-identified” to protect whale locations from poachers.

The discovery of vowels in whale speech has given lawyers who advocate for whale rights significant power in the courtroom. For centuries, whales have been classified as property—as things rather than as sentient creatures. Recently, though, lawyers have begun to argue that whales meet the criteria for legal personhood. They base this on several hard-to-deny criteria. For example, lawyers from the More-Than-Human Life Program at NYU and the Nonhuman Rights Project are moving away from general “sentience” arguments to specific “communication” arguments. If an animal has a complex language, it possesses autonomy—the ability to make choices and have preferences. In many legal systems, autonomy is the primary qualification for having rights.

Another argument makes the case that by proving that whales use combinatorial grammar—the vowels we’ve been discussing—scientists have provided evidence that whale thoughts are structured and abstract. Lawyers argue that the law can’t logically grant rights to a human with limited communication skills, like a baby, while at the same time denying them to a whale with a sophisticated “phonetic alphabet.” 

In March 2024, Indigenous leaders from the Māori of New Zealand, Tahiti, and the Cook Islands signed a treaty which recognizes whales as legal persons with the right to “cultural expression.” That includes language. Because we now know that whales have unique “regional dialects,” the treaty argues that whales have a right to their culture. This means that destroying a pod isn’t just killing animals; it amounts to the “cultural genocide” of a unique linguistic group.

Then, there’s the issue of legal representation of whales in a court of law. We have now seen the first attempts to use AI-translated data as evidence in maritime court cases. For example, in late 2025, a landmark paper in the Ecology Law Quarterly argued that human-made sonar and shipping noise amounts to “torture by noise” and is the acoustic equivalent of “shining a blinding light into a human’s eyes 24 hours a day.” And instead of relying on the flimsy argument that whales can just swim away from noise (clearly demonstrating a complete ignorance of marine acoustics and basic physics), lawyers are using WhaleLM data to demonstrate how human noise disrupts their vowels, making it impossible for whales to communicate with their families. And the result? We’re moving from a world where we protect whales because they’re pretty, to a world where we protect them because they’re peers. 

Human-generated noise has long been a problem in the natural world. Whether it’s the sound of intensive logging in a wild forest, or noise generated by shipping or mineral exploration in the ocean, there’s significant evidence that those noises have existentially detrimental effects on the creatures exposed to them—and from which they can’t escape. The good news is that as awareness has risen, there have been substantial changes in how we design underwater technology so that it is more friendly to marine creatures like whales. Essentially, there is a shift underway toward Biomimetic Technology—hardware that mimics how whales communicate as a way to minimize the human acoustic footprint. This includes the development of acoustic modems that use transmission patterns modeled after whale and dolphin whistles instead of the loud sonar pings used in traditional technology. Whales and other creatures hear them as background noise.

Another advance is the use of the SOFAR Channel. SOFAR is an acronym that stands for Sound Fixing and Ranging, and it refers to a deep layer in the ocean, down around 3,300 feet, where sound travels for great distances, much farther than in other regions of the ocean. The layer acts as a natural waveguide that traps low-frequency sounds, allowing them to travel thousands of miles and enabling long-distance monitoring of phenomena such as whale communication. Technology is now being designed to transmit within the SOFAR channel, allowing marine devices to use 80% less power by working with the ocean’s physics rather than against it, and at the same time being less disruptive to the creatures who live there.
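
For the physics-minded, here is a short sketch that finds the channel axis as the depth where sound speed reaches its minimum, using the canonical Munk profile from the ocean-acoustics literature with its textbook parameters. With those parameters the minimum falls near 1,300 meters; the real axis varies with latitude and is often shallower, closer to the roughly 3,300-foot (1,000-meter) figure mentioned above.

# Locate the SOFAR channel axis: the depth of minimum sound speed, using the
# canonical Munk (1974) sound-speed profile with textbook parameters.
import numpy as np

def munk_sound_speed(depth_m, c1=1500.0, z1=1300.0, scale=1300.0, eps=0.00737):
    eta = 2.0 * (depth_m - z1) / scale
    return c1 * (1.0 + eps * (eta + np.exp(-eta) - 1.0))

depths = np.arange(0.0, 5000.0, 1.0)                    # 0 to 5,000 m in 1 m steps
axis = depths[np.argmin(munk_sound_speed(depths))]
print(f"Sound-speed minimum (channel axis) at about {axis:.0f} m")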

Gardiner Hubbard said, “When we embark on the great ocean of discovery, the horizon of the unknown advances with us and surrounds us wherever we go. The more we know, the greater we find is our ignorance.” Interspecies communication is a great example of this. The more we learn, the more we unleash our truly awesome technologies on the challenge of listening to our non-human neighbors, the more we realize how much we don’t know. I’m good with that. Given the current state of things, it appears that 2026 may be the year when the great breakthrough happens. But the great question will be, when given the opportunity to shut up and listen, will we?

The Research Myth

I recently had a conversation about technology’s impact on the availability and quality of information in the world today. It’s an argument I could make myself—that tech-based advances have resulted in access to more data and information. For example, before the invention of moveable type and the printing press, the only books that were available were chained to reading tables in Europe’s great cathedrals—they were that rare and that valuable. Of course, it was the information they contained that held the real value, an important lesson in today’s world where books are banned from modern first world library shelves because an ignorant cadre of adults decides that young people aren’t mature enough to read them—when it’s the adults who lack the maturity to face the fact that not everybody thinks the same way they do in this world, and that’s okay. But, I digress.  

Image of chained books in Hereford Cathedral. Copyright Atlas Obscura.

When moveable type and the printing press arrived, book manuscripts no longer had to be copied by hand—they could be produced in large quantities at low cost, which meant that information could be made available to far more people than ever before. To the general population—at least, the literate ones—this was a form of freedom. But to those who wanted to maintain a world where books were printed once and kept chained to desks where only the privileged few (the clergy) could read them, the free availability of knowledge and information was terrifying. Apparently, it still is. Knowledge is, after all, the strongest form of power. How does that expression go again? Oh yeah: Freedom of the Press…Freedom of Expression…Freedom of Thought…Sorry; I digress. Again.

Fast-forward now through myriad generations of technology that broadened information’s reach: The broadsheet newspaper, delivered daily, sometimes in both morning and evening editions. The teletype. Radio. The telephone. Television. The satellite, which made global information-sharing a reality. High-speed photocopying. High-speed printing. The personal computer and desktop publishing software. Email. Instant Messaging and texting. And most recently, on-demand printing and self-publishing through applications like Kindle Direct, and of course, AI, through applications like ChatGPT. I should also mention the technology-based tools that have dramatically increased literacy around the world, in the process giving people the gift of reading, which comes in the form of countless downstream gifts.

The conversation I mentioned at the beginning of this essay took a funny turn when the person I was chatting with tried to convince me that access to modern technologies makes the information I can put my hands on today infinitely better and more accurate. I pushed back, arguing that technology is a gathering tool, like a fishing net. Yes, a bigger net can result in a bigger haul. But it also yields more bycatch, the stuff that gets thrown back. I don’t care about the information equivalents of suckerfish and slime eels that get caught in my net. I want the albacore, halibut, and swordfish. The problem is that my fishing net—my data-gathering tool—is indiscriminate. It gathers what it gathers, and it’s up to me to separate the good from the bad, the desirable from the undesirable.

What technology-based information-gathering does is make it easy to rapidly get to AN answer, not THE answer.

The truth is, I don’t have better research tools today than I had in the 70s when I was in college. Back then I had access to multiple libraries—the Berkeley campus alone had 27 of them. I could call on the all-powerful oracle known as the reference librarian. I had access to years of the Reader’s Guide to Periodical Literature. I had Who’s Who, an early version of Wikipedia; and of course, I had academic subject matter experts I could query. 

Technology like AI doesn’t create higher quality research results; what technology gives me is speed. As an undergraduate studying Romance Languages, I would often run across a word I didn’t know. I’d have to go to the dictionary, a physical book that weighed as much as a Prius, open it, make my way to the right page, and look up the word—a process that could take a minute or more. Today, I hover my finger over the word on the screen and in a few seconds I accomplish the same task. Is it a better answer? No; it’s exactly the same. It’s just faster. In an emergency room, speed matters. In a research project, not so much. In fact, in research, speed is often a liability.

Here’s the takeaway from this essay. Whether I use the manual tools that were available in 1972 (and I often still do, by the way), or Google Scholar, or some other digital information resource, the results are the same—not because of the tool, but because of how I use what the tool generates. I’ve often said in my writing workshops that “you can’t polish a turd, but you can roll it in glitter.” Just because you’ve written the first draft of an essay, selected a pleasing font, right and left-justified the text, and added some lovely graphics, it’s still a first draft—a PRETTY first draft, but a first draft, nonetheless. It isn’t anywhere near finished.

The same corollary applies to research or any other kind of news or information-gathering activity. My widely cast net yields results, but some of those results are bycatch—information that’s irrelevant, dated, or just plain wrong. It doesn’t matter why it’s wrong; what matters is that it is. And this is where the human-in-the-loop becomes very important. I go through the collected data, casting aside the bycatch. What’s left is information. To that somewhat purified result I add a richness of experience, context, skepticism, and perspective. Ultimately I generate insight, then knowledge, and ultimately, wisdom. 

So again, technology provides a fast track to AN answer, but it doesn’t in any way guarantee that I’ve arrived at anything close to THE answer. Only the secret channels and dark passages and convoluted, illuminated labyrinths of the human brain can do that. 

So yeah, technology can be a marvelous tool. But it’s just a tool. The magic lies in the fleshware, not the hardware. Technology is only as good as the person wielding it. 

The Dubious Value of Interspecies Communications

Like most young 19th-century boys, Hugh Lofting liked animals and playing outdoors. Born in 1886 in Maidenhead, in the English county of Berkshire, he had his own little natural history museum and zoo when he was six years old. The fact that it was in his mom’s bedroom closet wasn’t a problem until she found it there.

The point is, Hugh loved nature, and everyone who knew him was convinced that he’d become a naturalist, or biologist, or something in a related field, when he grew up. So, everybody was surprised when he decided to study civil engineering. He started at MIT near Boston and completed his degree at London Polytechnic. When he graduated, he got work in the field: prospecting and surveying in Canada, working on the Lagos Railway in West Africa, then on to the Railway of Havana in Cuba. After traveling the world, he decided that a career change was in his future. He married, settled down in New York City, had kids, and began to write articles for engineering magazines and journals about topics like building culverts.

In 1914, World War I, ‘The Great War, The War to End All Wars,’ broke out, and Hugh was commissioned as a lieutenant in the Irish Guards. He fought in Belgium and France, and the horrors of war affected him deeply. In fact, his feelings about the natural world once again came to the surface, as he witnessed the treatment of draught animals in the war. Their suffering affected him as much as the suffering of his fellow soldiers. 

To help himself deal with the emotional trauma of war, he returned to his writing. He began to compose letters to his two children about a mythical, magical doctor who took care of animals, curing them of whatever malady had beset them.

In 1918, Hugh was badly wounded when a piece of shrapnel from a hand grenade shredded his leg. He left the military and after recovering from his injuries in England, returned to his family in New York.

Serendipity definitely played a role in the direction of Hugh Lofting’s life. His wife, charmed by the letters he wrote to his children while he was deployed, had kept them, and suggested he turn them into a book. He did. It was called, “The Story of Doctor Dolittle: Being the History of His Peculiar Life at Home and Astonishing Adventures in Foreign Parts.”

The book was an immediate bestseller, and between 1922 and 1928, he wrote a new Doctor Dolittle book every year, along with other titles.

Interesting story—it’s always fun to hear how a writer finds the track that defines their life’s work. But that’s not what I want to talk about here. I just finished re-reading Doctor Dolittle for the first time in a long time (I love children’s books), but I also just finished reading Ed Yong’s “An Immense World: How Animal Senses Reveal the Hidden Realms Around Us.” I didn’t plan it that way; they just happened to pop up in my reading stream, and much as it did for Hugh Lofting, serendipity kicked in. Doctor Dolittle could talk to animals; Ed Yong writes extensively in his book about the extraordinary ways that non-human species communicate. In fact, there’s been a lot of chatter in the press lately about advances in interspecies communication and our soon-to-be-available ability to translate what our non-human neighbors are saying. That’s quite a breakthrough, considering how much trouble I often have understanding what other HUMANS are saying.

Before I get too far into this, let’s lay down some basics. We are NOT the only species that communicates, nor are we the only species that uses body language. Lots of animals do that. Orangutans, for example, often use pantomime with each other, and even with their human caregivers in orangutan rescue centers. And after recording thousands of hours of sound and observing the behavior of herds of elephants over a long period, researchers have determined that elephants have a specific call that means, ‘Bees—Run!!!’ In fact, there may be a form of interspecies communication going on here. When African wild dogs, among the fiercest and most dangerous predators in all of Africa, show up, elephants give a specific warning call that causes other animals, like gazelle and impala, to take notice and run. But when elephants bellow about bees or other things, in calls that sound just as urgent, those same gazelle and impala don’t even flinch. They just keep grazing, entirely unconcerned.

Monkeys do similar things. Vervets, the annoying little monkeys that once invaded and destroyed my room at an African game preserve in search of the sugar packets that had been left for coffee, have distinct calls for distinct scenarios. If one of them sees a land-based predator, like a leopard, they issue a specific call and everybody takes to the trees. If they see an aerial predator, like a crowned eagle, a distinctly different call sends the troop into the safety of ground cover. 

Some species even add nuance and meaning to their calls by changing the order of the sounds they make. For example, if West African Campbell’s monkeys begin their threat calls with a deep booming sound, it means that whatever threat they’re seeing is still far away, but pay attention—be aware. If they start the call without the booming sound, it means that the threat is close and that whoever hears it should take cover immediately.

Sixty years ago, Roger Payne, a bioacoustics researcher at Tufts University who spent his time listening to the calls of moths, owls and bats, met a naval engineer who monitored Soviet submarine activity using hydrophones scattered across the sea floor. The engineer told Payne about sounds he had recorded that weren’t submarines, and after playing them for him, Payne was gobsmacked. He asked for and was given a copy of the sounds, which turned out to be made by humpback whales, and after listening to them over and over for months, he began to detect that the sounds, which were extremely diverse, had a structure to them. He loaded the audio into a system capable of producing a spectrogram, which is a visual representation of a sound, using time on the X-axis and frequency on the Y. By the way, this required a partnership with IBM to get access to a mainframe computer to do the analysis. Anyway, what his analysis confirmed was that whales call in a very specific order of unique vocalizations. Sometimes a call lasts 30 seconds, sometimes 30 minutes, but the sequence is always the same—identifiable sequences that he called songs. In fact, in 1970, Payne published his recordings as an album called Songs of the Humpback Whale. It became a best-seller, with more than 125,000 copies sold, and catalyzed the effort to end commercial whaling around the world. Some of its tracks were included on the Golden Record attached to Voyager 1 and 2 when they were launched into deep space in 1977.

Most recently, researchers have taken their analysis of animal sounds even farther, using AI to identify more complex patterns. Shane Gero is a Carleton University researcher who for the last 20 years has studied the vocalizations of sperm whales. After analyzing hundreds of hours of recordings, he and his team identified specific characteristic click patterns known as codas. It appears that the whales use these unique sounds to identify each other. He and his team are now feeding the sounds they’ve captured into large language models, unleashing AI against the data in an effort to enhance our understanding of whale speak.

That’s remarkable—stunning, in fact. But speaking for myself, I feel inclined to invoke what I call the Jurassic Park Effect: Just because you can doesn’t mean you should. In the movie, researchers re-created dinosaurs from the DNA found in dinosaur blood in the stomachs of Jurassic mosquitos that were trapped in amber. They did it because they could, ignoring whether or not they should, and it didn’t end well. In fact, none of the sequels did—for humans, anyway. Creating a large language model to translate other species’ languages into human language strikes me as the same thing. Because when it happens, the conversation might go something like this:

‘Hey—nice to meet you! We’re the creatures who violently kick you out of your homes and then tear them down because we want to live there instead; we destroy your food sources; we blast loud noises into your marine homes 24 hours a day; we capture and eat huge numbers of you; we pour countless toxins into your air and water and soil; we build huge dams on your rivers to prevent you from migrating home as you’ve done for thousands of years; we do all kinds of things to help to make the environment hotter and unpredictably violent; and we make your terrestrial habitat so noisy that you can’t hear predators coming or mates calling. So with that introduction, how ya doin’? What shall we talk about?’

I don’t know. Maybe it’s just me, but I don’t think we’re gonna like what they have to say.