A true Albertan with an intimate knowledge of all four corners of the province, U of A Distinguished Professor David Wishart was born and raised in Edmonton, with roots deeply embedded in Alberta’s diverse landscapes. His upbringing was steeped in the natural world, from fishing in the mountains near Jasper to hiking the badlands of Drumheller and hunting birds around Wainwright. His father, a wildlife biologist, and his mother, a naturalist and teacher, instilled in him a profound appreciation for exploration and the outdoors. This connection to the land was further strengthened by his Métis heritage and a family history that included farming and involvement in the Alberta Wheat Pool.

He inherited his parents’ passion for the world of science and reflects on a pivotal moment that all but cemented his eventual career choice. “When I was seven or eight years old,” he remembers fondly, “during a family trip to Drumheller, I unearthed dinosaur teeth and bones. Finding this ‘treasure’ ignited a fascination with paleontology and, more broadly, with science, ultimately steering my academic path towards biology, biochemistry, and biophysics.”

Dr. Wishart’s formal post-secondary education began with undergraduate physics at the University of Alberta, which exposed him to mathematics, computing science, and what was then called artificial intelligence but is closer to the discipline now known as “machine learning” (ML). He then pursued a Ph.D. in biochemistry and biophysics at Yale, receiving an M.Phil. along the way. A change in supervision saw him return to the U of A, where he completed his Ph.D. and eventually a postdoctoral fellowship focusing on protein engineering.

His work in this field built upon the techniques pioneered by Canadian Nobel laureate Michael Smith and involved designing proteins to perform new functions by making targeted changes to their genetic sequences. It’s a field with broad practical applications, from developing enzymes for detergents that work effectively in various water temperatures to engineering new forms of insulin and creating a significant portion of modern pharmaceutical drugs, including antibodies and cytokines.

While his formal training wasn’t initially in agricultural research, his lab’s expertise in biochemistry and data analysis led to collaborations with agricultural researchers over the past 10 to 15 years. These included work with agronomists such as Derek MacKenzie, evaluations of animal health (beef and dairy cattle), and assessments of food composition, quality, and safety. These joint projects focused primarily on agricultural products.

The leap to SOIL-HUB was a natural progression from his lab’s long-standing work in data science and machine learning. “I realized that the massive amounts of data we generated in structural biology and biochemistry, coupled with my lab’s proficiency in data generation, database creation, and data mining, could be applied to a pressing agricultural issue: soil health,” he explains.

“I understand the seeming complexity of what we’re doing—and it’s definitely not easy,” he laughs. “But at its core, SOIL-HUB is simply an initiative to gather and organize vast amounts of disconnected soil and other agricultural data that currently reside in various formats, often in isolated spreadsheets or hard drives. Imagine having all the individual parts of a car scattered about. Our team seeks to assemble the disparate pieces into a fully functioning vehicle, which farmers will be able to drive, easily and safely.”

The process involves:

  • Data Normalization and Formatting: Converting diverse data (from farmer-collected Excel sheets to satellite imagery) into a standardized, computer-readable format. This often involves conventional programming to handle tedious conversions, like switching rows and columns, or addressing more subtle formatting issues.
  • Data Cleaning and Imputation: Identifying and correcting errors in the data (like wrong numbers) and filling in missing information. For instance, if a farm has precipitation data for April, May, and June but not July and August, SOIL-HUB might use data from a neighbouring farm to fill those gaps.
  • Machine Learning for Interpretation and Prediction: This is where the “magic” happens. Instead of writing explicit rules for every scenario (like a traditional calculator program for addition), ML algorithms are “trained” by being fed many examples of data.
    • For example, if Professor Wishart provides the system with vast datasets on topography, moisture levels, and corresponding crop yields, the ML program will “learn” the patterns and relationships between these seemingly unrelated factors. It won’t necessarily explain how it learned, but it will be able to predict crop yields based on new inputs of topography and moisture. This is a pattern-detection process that requires a critical mass of data to be effective.
  • User-Friendly Interfaces: Ultimately, SOIL-HUB is about making these complex data and predictive models accessible to everyone, not just computer scientists. Through accessible graphical user interfaces (GUIs), farmers, agronomists, and academics can easily interact with the system. For instance, farmers could input their farms’ geocoordinates on a map of Alberta and instantly get insights into their soil health, projected weather patterns, or even how their crop yields compare to similar farms in the region, potentially identifying areas for improvement.
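The first three steps above can be illustrated with a minimal Python sketch. It is not SOIL-HUB's actual code: the function names, the sample values, and the choice of a straight-line least-squares fit as a stand-in for "machine learning" are all assumptions made for illustration. It shows a row/column switch, gap-filling from a neighbouring farm, and a model that learns a pattern from examples and then predicts from a new input.

```python
from statistics import mean

def transpose(rows):
    """Switch rows and columns of a rectangular table (list of lists)."""
    return [list(col) for col in zip(*rows)]

def impute_from_neighbour(farm, neighbour):
    """Fill missing (None) monthly values with the neighbour's value for that month."""
    return {month: (value if value is not None else neighbour.get(month))
            for month, value in farm.items()}

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (a deliberately simple model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

# --- Hypothetical sample data, invented purely for illustration ---

# 1. Normalization: a sheet stored "sideways" is flipped to one record per row.
sheet = [["month", "Apr", "May", "Jun"],
         ["precip_mm", 30, 45, 60]]
records = transpose(sheet)

# 2. Imputation: July and August are missing, so borrow the neighbour's readings.
farm = {"Apr": 30, "May": 45, "Jun": 60, "Jul": None, "Aug": None}
neighbour = {"Apr": 28, "May": 47, "Jun": 58, "Jul": 55, "Aug": 40}
filled = impute_from_neighbour(farm, neighbour)

# 3. Prediction: learn a moisture-to-yield relationship from examples,
#    then predict the yield for a moisture level not seen before.
moisture = [10, 20, 30, 40]        # assumed units: percent
crop_yield = [2.0, 3.0, 4.0, 5.0]  # assumed units: tonnes per hectare
a, b = fit_line(moisture, crop_yield)
predicted = a + b * 25

print(records)
print(filled)
print(round(predicted, 2))
```

A production system would presumably use far richer models and many more inputs (topography, satellite imagery, weather), but the shape is the same: tidy the data, fill the gaps, fit on examples, then predict from new inputs. With only four points the fit is trivial, which is why a critical mass of data matters in practice.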

The funding he’s received for his smart farm project is primarily supporting this crucial data gathering and cleaning process, as much of the existing agricultural data isn’t in a readily usable format. It’s a significant undertaking, but one that promises to unlock invaluable insights for agricultural optimization and sustainability.

At its core, SOIL-HUB succeeds in large measure because it works like a Michelin three-star kitchen. “I guess you could say I’m the head chef, Scott MacKay is my sous chef, and the rest of the team are responsible for food prep.” He pauses, then notes, “But that’s where the analogy ends. Instead of specializing in a specific ‘cuisine,’ I’m a scientific generalist, which allows me to connect disparate fields, from physics and biochemistry to machine learning and agriculture.”

When asked what CAAIN/ISED funding has meant to his project, he lists a range of benefits, including recruiting top co-op students and repurposing tools used elsewhere, such as heat mapping, to make them integral to SOIL-HUB. The support has also accelerated development and provided the CAAIN “stamp of approval,” which many funding recipients identify as a significant benefit. Receiving CAAIN/ISED funding means the project team has successfully navigated a rigorous evaluation process, establishing their work’s validity.

Dr. Wishart concludes the interview by explaining that SOIL-HUB is going to morph into SAFE-HUB. “We envision a broader agricultural system that integrates diverse agricultural data, extending beyond soil to include livestock of all kinds across the country. AI will be a core component of this endeavour, cleaning data, providing predictive modeling, and allowing for the use of large language models to operate interactive advisory systems, or chatalytics. In short, SOIL-HUB will become a powerful tool that will allow farmers, agronomists, and other stakeholders to access the power and versatility of AI in their daily activities, making them more productive and profitable.”

Project Lead

Metabolomics Innovations Inc.

Project Partners

Metabolomix Inc.

AltaML

Project Description

SOIL-HUB, a web-accessible resource benefiting prairie farmers and agronomists, will integrate specialized software for comprehensive soil health assessment, monitoring, and management and incorporate advanced machine learning tools for weather prediction, soil management, and quality analysis.

Project Investment

CAAIN Contribution
$86,667

Industry Cash Contribution
$130,000

In-Kind Contribution
$107,500

Total Project Value
$388,648

Project Contact

Scott MacKay
Research Associate, Faculty of Science – Biological Sciences
samackay@ualberta.ca