William Lonon.
I turn data into decisions — machine learning, research, and the tools that put models to work. I study how systems behave, measure what they actually do, and build the pipelines that make the answers usable.
About
I'm a data scientist and developer based in Arkansas. I hold a B.S. in Data Science from the University of Arkansas, where I completed honors research on the behavior and biases of generative AI.
My work sits where analysis meets engineering: framing the question, building the model, measuring it honestly, and shipping the tool that delivers the result. I'm as comfortable in a Jupyter notebook as I am wiring up an API or deploying a site.
I care about rigor and clarity in equal measure — research you can trust, and results you can act on. I'm currently open to data science and analytics roles.
Beyond the numbers, I run an independent music label and build software for the things I care about. The full builder portfolio lives at williamlonon.dev.
Research
Published, peer-mentored research at the intersection of data science and AI safety.
Testing the limits and revealing the biases of generative AI through multimodal feedback loops. I ran GPT-4o and DALL·E in a "telephone game," chaining image captioning and text-to-image generation over many iterations, and measured how bias compounds across cycles. CLIP similarity and facial-recognition metrics tracked semantic drift, identity preservation, and information loss.
- Inside the loop, models avoided naming a controversial public figure's real-world associations up to roughly 1800% more often than they did on standalone images
- Documented systematic identity drift across regenerations (e.g., Martin Luther King Jr. morphing toward Adolf Hitler)
- Exposed measurable gaps between stated content policies and actual model behavior
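One way to picture the semantic-drift measurement is to compare each iteration's embedding with the embedding of the original image. The sketch below is a minimal illustration, not the code used in the research: it assumes the embeddings have already been computed, and the function names and three-dimensional toy vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def drift_from_original(embeddings):
    """Similarity of every iteration's embedding to the first one."""
    original = embeddings[0]
    return [cosine_similarity(original, e) for e in embeddings]


# Toy vectors standing in for real image embeddings (made-up values).
toy_embeddings = [
    [1.0, 0.0, 0.0],  # iteration 0: the original image
    [1.0, 1.0, 0.0],  # iteration 1: partly drifted
    [0.0, 1.0, 0.0],  # iteration 2: unrelated to the original
]
similarities = drift_from_original(toy_embeddings)
```

A similarity curve that falls as the iteration count rises would indicate that the generated images are moving away from the original content.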
Selected Work
Applied data science and machine learning — from audio ML to LLM-driven automation.
A desktop tool that turns any audio file into a playable sampler instrument. It slices a recording at every onset, fingerprints each slice by timbre, and clusters acoustically similar sounds onto the same key — an unsupervised ML pipeline you can play with your keyboard.
- Onset slicing + timbral feature extraction with librosa
- Three clustering modes: KMeans, Agglomerative, and HDBSCAN
- Real-time playback engine with ADSR, filtering, and velocity crossfade
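As an illustration of the ADSR stage, here is a minimal sketch of a linear attack-decay-sustain-release envelope applied to one slice. It is an assumption about how such an envelope could be built, not the tool's actual playback engine; the function name, parameters, and the constant test signal are hypothetical.

```python
def adsr_envelope(n_samples, sample_rate, attack, decay, sustain_level, release):
    """Return n_samples gain values for a linear ADSR envelope.

    attack, decay, and release are durations in seconds; sustain_level is a
    gain between 0 and 1. The release occupies the final `release` seconds.
    """
    a = round(attack * sample_rate)
    d = round(decay * sample_rate)
    r = round(release * sample_rate)
    if a + d + r > n_samples:
        raise ValueError("attack + decay + release exceed the slice length")
    s = n_samples - a - d - r  # samples spent at the sustain level

    envelope = []
    envelope += [i / a for i in range(a)]                           # ramp 0 -> 1
    envelope += [1.0 - (1.0 - sustain_level) * i / d for i in range(d)]  # 1 -> sustain
    envelope += [sustain_level] * s                                 # hold
    envelope += [sustain_level * (1.0 - i / r) for i in range(r)]   # fade toward 0
    return envelope


# One second of a constant signal at a 1 kHz sample rate (toy values).
envelope = adsr_envelope(1000, 1000, attack=0.1, decay=0.1,
                         sustain_level=0.5, release=0.2)
signal = [1.0] * 1000
shaped = [x * g for x, g in zip(signal, envelope)]
```

Multiplying each slice by an envelope like this avoids clicks at the slice boundaries, since the gain starts at zero and fades out before the slice ends.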
An end-to-end, LLM-powered outreach pipeline built to promote music releases at scale. It targets 9,000+ industry contacts with personalized, genre-aware pitches and covers every stage from data extraction to automated, customized delivery.
- Playwright scraper handling virtual scrolling and dynamic DOM
- Claude API for personalized generation by genre & contact type
- Full pipeline from extraction to automated delivery
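The personalization step can be sketched as a prompt builder that varies the request by genre and contact type. This is a hypothetical outline, not the pipeline's real code: the `Contact` fields, the `build_pitch_prompt` function, and the sample names are invented for illustration, and the call that would send the prompt to the LLM is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    """Hypothetical contact record; field names are illustrative."""
    name: str
    contact_type: str                      # e.g. "playlist curator", "blogger"
    genres: list = field(default_factory=list)


def build_pitch_prompt(contact, release_title, release_genre):
    """Build the text prompt that would be sent to an LLM for one contact."""
    covered = {g.lower() for g in contact.genres}
    if release_genre.lower() in covered:
        angle = f"They already cover {release_genre}, so lead with the genre fit."
    else:
        listed = ", ".join(contact.genres) or "unknown genres"
        angle = (
            f"They mainly cover {listed}, so explain why a {release_genre} "
            "release could still interest their audience."
        )
    return (
        f"Write a short, personal pitch to {contact.name}, a {contact.contact_type}.\n"
        f"The release is '{release_title}' ({release_genre}).\n"
        f"{angle}\n"
        "Keep it under 120 words and avoid generic marketing language."
    )


# Made-up contact and release, used only to show the output shape.
curator = Contact("Sam Rivera", "playlist curator", ["ambient", "downtempo"])
prompt = build_pitch_prompt(curator, "Night Signals", "ambient")
```

Keeping prompt construction separate from the API call makes it possible to review and test the wording for each genre and contact type before anything is sent.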
Toolkit
The stack I reach for, from first question to shipped result.
with data.
Open to data science and analytics roles, research collaborations, and consulting. Have a problem worth measuring? Reach out.