RT3: Edge Computing for AI

Theme Leaders: Dr Thomas Szydlo & Dr Blesson Varghese

Researchers: Dr Jennifer Williams, Prof Julie McCann, Prof Gopal Ramchurn, Prof Qi Wang and Prof Jose M. Alcaraz Calero

RT3 aims to find ways to make complex Edge AI models run smoothly on different types of edge computing systems. Because these systems vary widely in hardware and capability, it is hard for an Edge AI model to work well on all of them.

One approach that makes this easier is TinyML, a class of AI models designed for resource-constrained edge devices. But TinyML still has open problems. For example, it is hard to predict how well a TinyML model will perform on a given type of edge device. TinyML also does not always cope well with changes in the system, such as unreliable network connections or devices coming under attack.

To address these problems, we will work on two main things. First, we will develop ways to benchmark Edge AI models on different edge devices, helping developers know which models are best suited to which devices. Second, we will create a new way for Edge AI models to adapt themselves to what is happening in the system, so that they keep working well even when problems arise.

Our goal is to make it easier for developers to build and use Edge AI models on edge computing systems. We will create tools and techniques to help them do this, so Edge AI can be used in more places and work better for everyone.

Case Studies for RT3


TinyML Framework for Embedded Devices 

Figure 1 TinyML

FogML is a set of tools enabling TinyML on microcontrollers as resource-limited as ARM Cortex-M0 cores. In contrast to many other frameworks, FogML utilises classic machine learning methods such as density-based anomaly detection and classifiers based on Bayesian networks, decision forests and vanilla multi-layer perceptrons (MLPs). It also supports off-device learning for classification problems and on-device learning for anomaly detection. The active-learning anomaly detection is based on reservoir sampling and outlier-detection algorithms trained directly on the device. A dedicated library performs time-series processing on the device, computing a feature vector of RMS, FFT, amplitude and other low-level signal metrics. One of the techniques used in FogML is source-code generation of the inference functions for embedded devices, which yields a much smaller memory footprint than computationally heavier solutions such as deep neural networks.
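The time-series featurisation described above can be illustrated with a short Python sketch that computes RMS, peak-to-peak amplitude and the dominant FFT bin for one sensor window. The function and feature selection are illustrative assumptions for exposition; FogML itself generates source code for the target device rather than exposing a Python API like this.

    import numpy as np

    def feature_vector(window: np.ndarray) -> np.ndarray:
        # Root-mean-square energy of the window.
        rms = np.sqrt(np.mean(window ** 2))
        # Peak-to-peak amplitude.
        amplitude = window.max() - window.min()
        # Magnitude spectrum; keep the strongest non-DC bin and its index.
        spectrum = np.abs(np.fft.rfft(window))
        dominant = int(spectrum[1:].argmax()) + 1
        return np.array([rms, amplitude, spectrum[dominant], float(dominant)])

    # A 128-sample window (four sine cycles) standing in for real sensor data.
    window = np.sin(np.linspace(0.0, 8.0 * np.pi, 128))
    print(feature_vector(window))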

Examples of anomaly detection and classification algorithms for embedded devices are provided by the FogML project.

TinyML lifecycle management with FOTA (firmware over-the-air updates) using the LwM2M protocol.

NeuroFlux: Memory-efficient Training for Small Devices 

Figure 2 NeuroFlux

NeuroFlux introduces an innovative solution to the memory-intensive challenge of training neural networks on small form-factor devices such as single-board computers. Traditional machine learning (ML) relies on backpropagation, which requires significant memory to store activations during the forward pass and gradients during the backward pass. This memory demand can make training neural networks infeasible on small devices with limited resources. In contrast, NeuroFlux utilises an adaptive local learning algorithm that eliminates forward-backward dependencies in neural network training. This breakthrough allows the entire process to fit within the memory constraints of small devices, such as the GPUs on single-board computers. NeuroFlux is faster and more memory-efficient than backpropagation while achieving similar or better accuracy; for more details, see the full NeuroFlux paper.
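To make the removal of forward-backward dependencies concrete, here is a minimal PyTorch sketch of layer-wise local learning: each block trains against its own auxiliary head and detaches its output, so no end-to-end chain of activations and gradients is ever stored. This illustrates the general principle only; it is not the NeuroFlux algorithm itself, and all class and variable names are assumptions.

    import torch
    import torch.nn as nn

    class LocalBlock(nn.Module):
        # A feature block with its own auxiliary classifier head, so its
        # loss and gradients stay local to the block.
        def __init__(self, in_ch, out_ch, num_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(),
            )
            self.head = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(out_ch, num_classes),
            )

    blocks = [LocalBlock(3, 16, 10), LocalBlock(16, 32, 10)]
    optims = [torch.optim.SGD(b.parameters(), lr=0.01) for b in blocks]
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(8, 3, 32, 32)         # dummy image batch
    y = torch.randint(0, 10, (8,))        # dummy labels

    for block, opt in zip(blocks, optims):
        h = block.features(x)
        loss = loss_fn(block.head(h), y)  # local loss via the auxiliary head
        opt.zero_grad()
        loss.backward()                   # backward pass confined to one block
        opt.step()
        x = h.detach()                    # cut the graph: earlier activations
                                          # need not be retained in memory

Because each backward() touches only one block, peak memory is bounded by the largest block rather than by the whole network.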

SAFE

Figure 3 SAFE

Our solution to detecting deepfakes has the smallest footprint of any commercial or academic audio deepfake detection solution currently available, at only 50 MB. We can detect fake versus real speech audio in near real time, with an inference time of only 0.7 s at a granularity of 1-second 'chunks' of streaming audio. An innovative modular design allows us to customise the model to keep pace with the evolving threat landscape and to tailor its interpretability for various use cases, including human decision support. We obtain outstanding performance across a variety of deepfake generation technologies, including GANs, commercial generation tools and state-of-the-art vocoders. Our solution supports a variety of audio file compression types, bit rates and audio sample rates.
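The chunked, near-real-time scoring described above can be sketched as a simple streaming loop in Python. The detect() function and the 16 kHz sample rate below are placeholders for illustration, not the SAFE model or its interface.

    import numpy as np

    SAMPLE_RATE = 16_000              # assumed rate; SAFE supports several
    CHUNK = SAMPLE_RATE               # 1-second analysis chunks, as in SAFE

    def detect(chunk: np.ndarray) -> float:
        # Hypothetical stand-in for the detector: returns the probability
        # that the chunk is synthetic speech.
        return float(np.clip(np.abs(chunk).mean(), 0.0, 1.0))

    def stream_scores(audio: np.ndarray):
        # Score the stream in consecutive 1-second chunks.
        for start in range(0, len(audio) - CHUNK + 1, CHUNK):
            yield start / SAMPLE_RATE, detect(audio[start:start + CHUNK])

    audio = np.random.randn(5 * SAMPLE_RATE).astype(np.float32)  # 5 s of noise
    for t, score in stream_scores(audio):
        print(f"t={t:4.1f}s  fake-probability={score:.2f}")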

Machine Learning at the Edge using the Tsetlin Machine

The Tsetlin Machine (TM) is a novel approach to machine learning that provides a non-arithmetic alternative to traditional deep neural networks (DNNs). Originally inspired by Mikhail Tsetlin's learning automata theory [1], it is a logic-driven ML algorithm invented by Ole-Christoffer Granmo in 2018 [2]. In TMs, the input data are featurised as Boolean literals rather than the real-valued inputs used in neural networks (see the stylised illustration below). The inclusion or exclusion of these literals is determined during training through reinforcement learning in finite state machines known as Tsetlin automata (TA). A group of TAs produces a conjunctive logical expression within a clause through AND rules. Two teams of independent clauses participate in a majority vote to support or oppose a particular classification decision. For multi-class problems, the teams of clauses are organised in multiple parallel paths.

Figure 4 Tsetlin

A stylised representation of Tsetlin Machine, illustrated by Dr Jie Lei (formerly with Newcastle Microsystems Group)
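The clause-and-voting mechanism described above can be made concrete with a toy Python sketch of TM inference. The clauses here are hand-picked for illustration; in a trained TM, the inclusion or exclusion of each literal is decided by the Tsetlin automata during training.

    def literals(x):
        # A TM sees each Boolean feature alongside its negation.
        return x + [1 - v for v in x]

    def clause(included, lits):
        # A clause is the AND of its included literals (indices into lits).
        return all(lits[i] for i in included)

    # Two features -> four literals [x0, x1, NOT x0, NOT x1].
    for_clauses = [[0, 1]]            # votes FOR the class:  x0 AND x1
    against_clauses = [[2], [3]]      # vote AGAINST:         NOT x0, NOT x1

    def classify(x):
        lits = literals(x)
        votes = sum(clause(c, lits) for c in for_clauses) \
              - sum(clause(c, lits) for c in against_clauses)
        return int(votes > 0)         # majority vote, thresholded at zero

    print(classify([1, 1]))           # 1: the FOR clause fires
    print(classify([1, 0]))           # 0: an AGAINST clause outweighs it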

The logic-driven paradigm of TMs offers three fundamental properties that set them apart from traditional DNNs. Firstly, learning in a TM is organised in a single logic layer with a small number of hyperparameters, so training converges faster than in DNNs, which feature multi-layer regression arithmetic with a significantly higher number of hyperparameters. Secondly, whereas DNNs provide arithmetically heavy, black-box mappings between data and decisions, TMs are inherently interpretable: they produce models in sparse disjunctive normal form, which is comparatively intelligible to humans [3]. Finally, the logical representations combined with automata-based learning make TMs natively suitable for hardware implementation, yielding a low energy footprint and high throughput [4, 5].

Underpinned by steady research growth over the last five years, TMs now support various architectures. These include convolutional [6], regression [7] and deterministic TMs, weighted clauses, autoencoding, contextual bandits, relational TMs and multiple-input multiple-output architectures. The independent nature of clause learning allows efficient GPU-based parallelisation, providing almost constant-time scaling for a reasonable number of clauses. Several schemes enhance vanilla TM learning and inference, such as clause dropping and focused negative sampling. These advances have enabled many applications: keyword spotting [8], aspect-based sentiment analysis, novelty detection, semantic relation analysis, text categorisation, game playing, batteryless sensing [9], recommendation systems and knowledge representation. Recently, TMs have gained traction in the commercial R&D space for empowering low-complexity and explainable machine learning at the edge. Examples include Literal Labs, a spinout from Newcastle University's Microsystems Group (UK), and Tsense Intelligent Healthcare AS, originating from the University of Agder (Norway).

While TMs have unleashed significant opportunities for low-energy (for example, as low as 8.6 nJ per frame using a convolutional TM accelerator [10]) and low-complexity machine learning applications [4, 8, 9], several challenges lie ahead. Below we discuss the major challenges that constitute the future body of TM research:

  1. Booleanization: A TM represents information with Boolean features, making the representation sparse and interpretable. To realize these benefits, however, raw input data must be encoded in Boolean form, and the key requirement is that the Boolean features retain the informational value of the raw data. This is an application-specific problem and can present conflicting trade-offs between TM size, accuracy and complexity (a simple Booleanization scheme is sketched after this list).
  2. Scalability: As the number of Booleans, clauses and classes grows, TM sizes can increase dramatically. While TM clauses are modular and self-contained, they are currently organised in one massive clause pool, which can pose resourcing challenges for implementing or simulating TMs.
  3. Logical Transformers: Booleanization requires new representation methods to replace the contemporary real-valued ones. For instance, in natural language processing, real-valued neural network-based word embedding has in some cases been outperformed by logical clauses. While researchers are developing increasingly powerful Booleanization schemes, the lack of labeled data limits scaling. Indeed, further research on self-supervised learning for TMs is needed. One promising avenue is exploring how to generate logical representations using sparse binary hypervectors, jointly encoding images and text.   
  4. Development Ecosystem: How to incrementally build large and complex TM systems is an open research problem, involving both knowledge representation and scalable software and hardware architectures. An open-source ecosystem encompassing TM algorithms, architectures and data is under development at https://github.com/cair/tmu
  5. Uncertainty quantification: TM learning is a stochastic process, and its properties appear to offer a means of quantifying data, model and output uncertainty.
  6. Causal TMs: Since TM clauses are modular and logical, it is possible to analyze and tailor them for specific purposes, such as capturing cause and effect rather than simple correlations. Initial results from this research activity interconnect Bayesian networks and TMs, opening up research on causal TMs. 
  7. Probabilistic knowledge bases: A long-term goal is the development of large-scale knowledge bases, founded on efficient TM learning and inference. In this context, the development of Boolean multimodal representation learning is an active area of research, spanning images, time series, video, natural language, and tabular data. 
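As a concrete illustration of the Booleanization challenge in item 1, the sketch below shows thermometer encoding, one simple way to turn a real-valued feature into Boolean inputs for a TM. The uniform thresholds and parameter names are illustrative assumptions, not a prescribed TM preprocessing API.

    import numpy as np

    def thermometer_encode(x, low, high, bits=8):
        # Bit k is 1 iff x exceeds the k-th of `bits` uniformly spaced thresholds.
        thresholds = np.linspace(low, high, bits + 2)[1:-1]
        return (x > thresholds).astype(np.uint8)

    print(thermometer_encode(0.3, 0.0, 1.0))   # -> [1 1 0 0 0 0 0 0]

Coarser or finer threshold grids trade the Boolean feature count (and hence TM size) against how faithfully the encoding preserves the raw value, which is exactly the trade-off noted in item 1.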

References:

[1] Tsetlin, Mikhail L’vovich. “Finite automata and models of simple forms of behaviour.” Russian Mathematical Surveys 18.4 (1963).

[2] Granmo, Ole-Christoffer. “The Tsetlin Machine–A Game Theoretic Bandit Driven Approach to Optimal Pattern Recognition with Propositional Logic.” arXiv preprint arXiv:1804.01508 (2018).  

[3] Bhattarai, Bimal, Ole-Christoffer Granmo, and Lei Jiao. “Explainable Tsetlin machine framework for fake news detection with credibility score assessment.” arXiv preprint arXiv:2105.09114 (2021).   

[4] Maheshwari, Sidharth, et al. “REDRESS: Generating Compressed Models for Edge Inference Using Tsetlin Machines.” IEEE Transactions on Pattern Analysis and Machine Intelligence (2023).  

[5] Wheeldon, Adrian, et al. “Learning automata based energy-efficient AI hardware design for IoT applications.” Philosophical Transactions of the Royal Society A 378.2182 (2020): 20190593.  

[6] Granmo, Ole-Christoffer, et al. “The convolutional Tsetlin machine.” arXiv preprint arXiv:1905.09688 (2019).  

[7] Darshana Abeyrathna, K., et al. “The regression Tsetlin machine: a novel approach to interpretable nonlinear regression.” Philosophical Transactions of the Royal Society A 378.2164 (2020): 20190165.  

[8] Lei, Jie, et al. “Low-power audio keyword spotting using Tsetlin machines.” Journal of Low Power Electronics and Applications 11.2 (2021): 18.

[9] Bakar, Abu, et al. “Logic-based intelligence for batteryless sensors.” Proceedings of the 23rd Annual International Workshop on Mobile Computing Systems and Applications. 2022.  

[10] Tunheim, S. A., et al. “An All-digital 65-nm Tsetlin Machine Image Classification Accelerator with 8.6 nJ per MNIST Frame at 60.3 k Frames per Second.” arXiv preprint arXiv:2501.19347 (2025).