Master's Thesis Thinking
I am halfway through my Master of Science degree and have started working on my proposal and the written research component of my work. For the better part of a year and a half before this, I was building an Augmented Reality task in which participants complete a complex series of cognitive-motor tasks that may, hopefully, help reconnect parts of their brains that have been injured by concussion or other mild traumatic brain injuries.
I also had a plan, or perhaps better described as a feeling, for secondary, more hard-science work that this project could accomplish; this post is an effort to put that goal into words that make some sort of sense.
This whole idea comes mostly from two particular sources. The first is Krishnamurti's "Freedom from the Known," where presence is postulated as the best way to live: we often don't treat reality as an instantaneous, moment-by-moment experience, but instead get stuck interacting with other layers of consciousness in a way that influences our current worldview and thoughts. What I took away was the idea that our neurological processes overlap in some way, and that meditation and many mindfulness practices in general aim to "calm" this cross-talk.
The second source was a machine learning course by Andrew Ng, where I learned the basics of Deep Learning, covering both vision models and transformers. One of the more surprising takeaways for me was that with an expansive series of arrays and dot products, we can simulate complex calculations, and that learning happens by minimizing loss functions. It blew my mind that a handful of mathematical functions, applied at scale, can produce very complex statistical outputs derived from "training" on data, be it images, text, or anything else.
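To make that takeaway a bit more concrete, here is a minimal sketch in plain Python of what "dot products plus a loss function" looks like at the smallest possible scale. It is not taken from the course; the names (`dot`, `mse_loss`, `train`), the toy data, and the learning rate are all my own assumptions for illustration. The model is a single dot product, and "training" is just nudging the weights in the direction that lowers the loss.

```python
import random

def dot(a, b):
    """Dot product of two equal-length lists of numbers."""
    return sum(x * y for x, y in zip(a, b))

def mse_loss(weights, inputs, targets):
    """Mean squared error between the model's predictions and the targets."""
    return sum((dot(weights, x) - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)

def train(inputs, targets, lr=0.1, steps=1000):
    """Fit the weights by gradient descent on the mean squared error."""
    n = len(inputs)
    weights = [0.0] * len(inputs[0])
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for x, t in zip(inputs, targets):
            err = dot(weights, x) - t          # prediction error for one sample
            for j, xj in enumerate(x):
                grads[j] += 2 * err * xj / n   # derivative of the loss w.r.t. weight j
        # Step each weight a little way downhill on the loss surface.
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Hypothetical toy data: targets are generated from known weights,
# so we can check whether training recovers them.
random.seed(0)
true_weights = [2.0, -1.0, 0.5]
inputs = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
targets = [dot(true_weights, x) for x in inputs]

learned = train(inputs, targets)
print("learned weights:", [round(w, 3) for w in learned])
print("final loss:", mse_loss(learned, inputs, targets))
```

A real deep learning model stacks many of these dot products with nonlinear functions in between, but the loop of "predict, measure the loss, adjust the weights" is the same idea.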
I think it was at this point that I started to consider combining these two ideas, meditation and machine learning, into one hard-to-describe idea. It became even messier in my mind once I decided to read "The Computational Neurobiology of Reaching and Pointing - A Foundation for Motor Learning" by Reza Shadmehr and Steven P. Wise. This wonderful book helped me refresh my understanding of motor control and motor learning, and it started to touch on probabilistic thinking about motor control. I am not sure whether the idea is ever made explicit in the text, but I started to consider whether many parts of our brain are simply conducting many instantaneous dot product operations on very large arrays, some of which are more pooled and others more precise. I started to wonder whether it would be possible to decouple these based on motor output if we could also, to some degree, track the inputs. Although we cannot track a person's inner thoughts as they happen, maybe we could pick apart the components of these shared integrative pools through large data collections of humans doing complex tasks that require many of these pools at once.
Phew. I think that's the broad idea I have been trying to set up. Overall, what I hope to achieve as a secondary goal in my research is this: to train a Deep Learning model on participants' performance in an Augmented Reality task as they, hopefully, transition from experiencing many symptoms and problematic behaviors to fewer, with their psychological, integrative, and cognitive-motor functions improving along the way.
Drawing out the basic science here, I think several streams of data could be analyzed together. On the input side, there are the visual surroundings, the sounds, and the task goal, interspersed with the user's own quirks, such as how they see themselves and how likely they think they are to achieve a score, goal, or target. On the output side, there are the motor outputs: eye movements captured through gaze tracking, body movements, and speech outputs such as mumbling.
The hypothesis here would be that we could create a predictive GAN or transformer model, structured around neuroanatomical streams, that allows us to better understand which integrative pools use which information at what moment during a task, and how external factors influence the motor output in real time.
At least in theory, I think this is possible, though it would require a lot of work.
If you're reading this and thinking, like me, that this is far too ambitious for a Master's degree, you would be right! The goal would be to pursue this idea full-time in the future, either as a PhD candidate or as a Research Associate. I think this kind of thing is not typically done partly because collecting human participant data is hard to begin with, and partly because Augmented Reality is still an extremely young field. Two years ago, tracking body pose would only have been possible with movie-quality equipment or fancy setups like OptiTrack; today the same thing can be accomplished with a handful of external plug-in cameras! In addition, Augmented Reality is not yet widely used in rigorous research, partly because Virtual Reality has taken most of the market share, and partly because nausea is easily induced when the user moves and interacts with the world.