Hello! I am a Computer Science Master's student at the University of Oxford, supervised by Professor Ronald Clark and supported by the Clarendon Fund. I am also working as a visiting researcher in the Scaling Intelligence Lab at Stanford with Professor Azalia Mirhoseini.
I obtained my undergraduate degree from the University of Waterloo, where I studied Software Engineering with a joint major in Combinatorics and Optimization. Previously, I was a Research Scientist intern at NVIDIA's Toronto AI lab, Layer 6 AI, and Akasha Imaging.
Demonstrating that increasing the amount of inference compute through repeated sampling leads to large improvements in coverage - the fraction of problems solved by any attempt - across a variety of tasks, models, and sample budgets. This makes it possible, and sometimes cost-effective, to amplify weaker models with many samples and outperform single attempts from more capable models.
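Coverage here is the pass@k metric. As a toy illustration (not the paper's code), it can be estimated with the standard unbiased estimator from Chen et al. (2021); the per-problem correctness counts below are hypothetical:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples is
    correct, given c correct answers observed among n total samples."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples exist, so any k-subset hits a correct one
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Coverage over a benchmark = mean pass@k across problems.
correct_counts = [3, 0, 12, 1]  # hypothetical: correct samples per problem, n = 100 each
coverage = np.mean([pass_at_k(100, c, k=10) for c in correct_counts])
print(f"coverage (pass@10): {coverage:.3f}")
```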
Introducing an exact, simple (no custom CUDA) implementation of attention that can accelerate LLM throughput by over 30x for problems containing shared prefixes and large batch sizes.
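One standard way to exploit shared prefixes, sketched below in PyTorch under assumed tensor shapes (batch, heads, length, dim): compute attention over the shared prefix and over each sequence's unique suffix separately, then merge the two results exactly using their softmax normalizers (log-sum-exp). The inter-sequence batching of the prefix computation, which is where the actual speedup comes from, is elided here.

```python
import torch

def attn_with_lse(q, k, v):
    """Attention output plus the log-sum-exp of scores, needed for exact merging."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    lse = torch.logsumexp(scores, dim=-1, keepdim=True)
    return torch.softmax(scores, dim=-1) @ v, lse

def shared_prefix_attention(q, k_pre, v_pre, k_suf, v_suf):
    """Attention over [prefix; suffix] as two sub-attentions merged exactly:
    out = (Z1 * o1 + Z2 * o2) / (Z1 + Z2), where Z_i = exp(lse_i)."""
    o1, lse1 = attn_with_lse(q, k_pre, v_pre)  # shared-prefix KV (batchable across sequences)
    o2, lse2 = attn_with_lse(q, k_suf, v_suf)  # per-sequence suffix KV
    w1 = torch.sigmoid(lse1 - lse2)            # = Z1 / (Z1 + Z2)
    return w1 * o1 + (1.0 - w1) * o2

# Sanity check: identical to attention over the concatenated KV sequence.
q = torch.randn(2, 4, 8, 16)
k_pre, v_pre = torch.randn(2, 4, 32, 16), torch.randn(2, 4, 32, 16)
k_suf, v_suf = torch.randn(2, 4, 8, 16), torch.randn(2, 4, 8, 16)
full, _ = attn_with_lse(q, torch.cat([k_pre, k_suf], 2), torch.cat([v_pre, v_suf], 2))
assert torch.allclose(shared_prefix_attention(q, k_pre, v_pre, k_suf, v_suf), full, atol=1e-5)
```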
Extending the manifold hypothesis to the setting where natural image data lies on a union of manifolds with varying intrinsic dimension.
Showing improved performance on generative modelling and image classification tasks by designing models with an inductive bias for this structure.
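For intuition, per-manifold intrinsic dimension can be probed empirically. Below is a minimal sketch using the Levina-Bickel k-NN maximum-likelihood estimator (not the paper's method), assuming scikit-learn is available and treating each class as its own manifold; the data `x` and labels `y` are placeholders:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dim(x: np.ndarray, k: int = 10) -> float:
    """Levina-Bickel MLE of intrinsic dimension from ratios of
    k-nearest-neighbor distances (inverse estimates averaged over points)."""
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(x).kneighbors(x)
    dists = dists[:, 1:]  # drop the zero distance to the point itself
    inv_dim = np.log(dists[:, -1:] / dists[:, :-1]).mean(axis=1)
    return float(1.0 / inv_dim.mean())

# Hypothetical per-class probe ("one manifold per class"):
# for label in np.unique(y):
#     print(label, mle_intrinsic_dim(x[y == label]))
```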
Demonstrating that large language models (LLMs) can be misled by providing them with factually correct but unrepresentative/biased examples, in the context of integer-to-integer piecewise functions.
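A toy illustration of the setup (the function and prompt format here are hypothetical): every demonstration is factually correct, yet all are drawn from one branch of the piecewise function, so the final query exercises a rule the examples never reveal.

```python
import random

def f(x: int) -> int:
    """Hypothetical integer-to-integer piecewise function."""
    return 2 * x if x < 0 else x + 7

def biased_prompt(n_examples: int = 8) -> str:
    """Correct but unrepresentative demonstrations: only the x >= 0 branch is shown."""
    xs = [random.randint(0, 50) for _ in range(n_examples)]
    lines = [f"f({x}) = {f(x)}" for x in xs]
    lines.append("f(-3) = ")  # the query lands in the branch the examples hide
    return "\n".join(lines)

print(biased_prompt())
```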
Investigating how the intrinsic dimension of activations in deep neural networks is affected by regularization, how it correlates with improved validation performance, and how it is coupled with the effects of sudden generalization (grokking).
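A minimal sketch of how such a probe could be wired up, assuming a PyTorch model and reusing `mle_intrinsic_dim` from the sketch above; the model, layer, and data below are placeholders:

```python
import torch

@torch.no_grad()
def layer_activations(model, layer, batch):
    """Capture one layer's activations with a forward hook."""
    acts = []
    handle = layer.register_forward_hook(
        lambda module, inputs, output: acts.append(output.flatten(1).cpu()))
    model(batch)
    handle.remove()
    return torch.cat(acts).numpy()

# Hypothetical usage: track activation dimension across training runs
# with and without regularization (e.g., weight decay, dropout).
# d_hat = mle_intrinsic_dim(layer_activations(model, model.layer3, images))
```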
Proposing a mathematically sound rotation augmentation scheme and loss modification for object detection models that lead to better rotation invariance/equivariance.
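The paper's exact augmentation and loss terms are not reproduced here; for background, this NumPy sketch shows the core geometry: rotating each box's corners about the image center and re-fitting an axis-aligned box. The over-coverage of that re-fitted box at angles other than multiples of 90 degrees is the kind of label noise a principled scheme has to account for.

```python
import numpy as np

def rotate_boxes(boxes: np.ndarray, theta: float, cx: float, cy: float) -> np.ndarray:
    """Rotate axis-aligned boxes (x1, y1, x2, y2) by theta radians about (cx, cy)
    and return the tight axis-aligned enclosure of the rotated corners. For theta
    not a multiple of 90 degrees, this enclosure over-covers the object."""
    x1, y1, x2, y2 = boxes.T
    corners = np.stack([np.stack([x1, y1], -1), np.stack([x2, y1], -1),
                        np.stack([x2, y2], -1), np.stack([x1, y2], -1)], axis=1)  # (N, 4, 2)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    rot = (corners - [cx, cy]) @ R.T + [cx, cy]
    return np.concatenate([rot.min(1), rot.max(1)], axis=1)  # (N, 4)
```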