For a visual prosthesis to be useful in daily life, the system relies on image processing to ensure that maximally relevant information is conveyed, e.g. allowing the blind neuroprosthesis user to recognize people and objects. Extracting the most useful features of a visual scene is a non-trivial task, and the definition of what is 'useful' for a user is strongly context-dependent (e.g. navigation, reading, and social interaction are three very different tasks that require different types of information to be conveyed). Despite rapid advancements in deep learning, it remains challenging to develop a general, automated preprocessing strategy that is suitable for use in a variety of contexts.

In this recent publication, we present a novel deep learning approach that optimizes the phosphene generation process in an end-to-end fashion. In this approach, both the delivery of stimulation to generate phosphene images (phosphene encoding) and the interpretation of these phosphene images (phosphene decoding) are modelled by deep neural networks. The proposed model includes a highly adjustable simulation module of prosthetic vision. All components are trained in a single loop, with the goal of finding an optimally interpretable phosphene encoding which can then be decoded to recover the original input.

In computational validation experiments, we show that this approach can automatically find a task-specific stimulation protocol, which can be tailored to specific constraints, such as stimulation on a sparse subset of electrodes. The approach is highly modular and could be used to dynamically optimize prosthetic vision for everyday tasks and to meet the requirements of the end user.
Jaap de Ruyter van Steveninck, Umut Güçlü, Richard van Wezel, Marcel van Gerven. https://doi.org/10.1101/2020.12.19.423601
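
To make the end-to-end loop concrete, below is a minimal PyTorch sketch under simplifying assumptions: the network architectures, the Gaussian-blob simulator on a regular 16 × 16 electrode grid, and all hyperparameters (including the `sparsity_weight` coefficient) are illustrative choices, not the implementation from the paper. The structural idea it mirrors is that the phosphene simulator is differentiable, so the reconstruction loss computed at the decoder can train the encoder through it.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a 64x64 input image to per-electrode stimulation amplitudes."""
    def __init__(self, n_electrodes=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, n_electrodes),
            nn.Sigmoid(),  # stimulation amplitudes in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

class PhospheneSimulator(nn.Module):
    """Differentiable stand-in for the prosthetic-vision simulator: each
    electrode's amplitude is rendered as a fixed Gaussian blob on a regular
    grid (an assumption; the paper's simulator is far more configurable)."""
    def __init__(self, n_electrodes=256, size=64, sigma=1.5):
        super().__init__()
        side = int(n_electrodes ** 0.5)
        cy, cx = torch.meshgrid(
            torch.linspace(2, size - 3, side),
            torch.linspace(2, size - 3, side),
            indexing="ij",
        )
        cy, cx = cy.flatten(), cx.flatten()
        yy, xx = torch.meshgrid(
            torch.arange(size, dtype=torch.float32),
            torch.arange(size, dtype=torch.float32),
            indexing="ij",
        )
        # (n_electrodes, size, size): one Gaussian phosphene per electrode
        basis = torch.exp(-((yy - cy[:, None, None]) ** 2
                            + (xx - cx[:, None, None]) ** 2) / (2 * sigma ** 2))
        self.register_buffer("basis", basis)

    def forward(self, stim):
        # Weight each phosphene by its amplitude and sum into one image.
        return torch.einsum("be,ehw->bhw", stim, self.basis).unsqueeze(1)

class Decoder(nn.Module):
    """Reconstructs the original input from the simulated phosphene image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, p):
        return self.net(p)

# Single training loop over all components; the simulator has no trainable
# parameters, but gradients flow through it from decoder to encoder.
encoder, simulator, decoder = Encoder(), PhospheneSimulator(), Decoder()
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
)
sparsity_weight = 1e-3  # assumed trade-off coefficient, not from the paper

images = torch.rand(8, 1, 64, 64)  # dummy batch standing in for a real dataset
for step in range(100):
    stim = encoder(images)          # phosphene encoding (stimulation pattern)
    phosphenes = simulator(stim)    # simulated prosthetic percept
    recon = decoder(phosphenes)     # phosphene decoding
    # Reconstruction objective plus an L1 penalty that encourages
    # stimulation on only a sparse subset of electrodes.
    loss = (nn.functional.mse_loss(recon, images)
            + sparsity_weight * stim.abs().mean())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In this sketch, raising `sparsity_weight` pushes the encoder toward stimulating fewer electrodes, which is one way to express the sparse-electrode constraint mentioned above as a term in the training objective.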