publications
Publications by category, in reverse chronological order.
2025
- Attention Via Convolutional Nearest Neighbors. Mingi Kang and Jeová Farias Sales Rocha Neto. 2025.
The shift from Convolutional Neural Networks to Transformers has reshaped computer vision, yet these two architectural families are typically viewed as fundamentally distinct. We argue that convolution and self-attention, despite their apparent differences, can be unified within a single k-nearest neighbor aggregation framework. The critical insight is that both operations are special cases of neighbor selection and aggregation; convolution selects neighbors by spatial proximity, while attention selects by feature similarity, revealing they exist on a continuous spectrum. We introduce Convolutional Nearest Neighbors (ConvNN), a unified framework that formalizes this connection. Crucially, ConvNN serves as a drop-in replacement for convolutional and attention layers, enabling systematic exploration of the intermediate spectrum between these two extremes. We validate the framework’s coherence on CIFAR-10 and CIFAR-100 classification tasks across two complementary architectures: (1) Hybrid branching in VGG improves accuracy on both CIFAR datasets by combining spatial-proximity and feature-similarity selection; and (2) ConvNN in ViT outperforms standard attention and other attention variants on both datasets. Extensive ablations on k values and architectural variants reveal that interpolating along this spectrum provides regularization benefits by balancing local and global receptive fields. Our work provides a unifying framework that dissolves the apparent distinction between convolution and attention, with implications for designing more principled and interpretable vision architectures.
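The abstract's central claim, that convolution and attention differ mainly in how neighbors are selected, can be illustrated with a toy 1-D example. This is a minimal sketch and not the paper's implementation: the function names are hypothetical, similarity is assumed to be a plain dot product, and aggregation is a uniform mean rather than learned convolution weights or softmax attention weights.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as the (assumed) feature-similarity score."""
    return sum(x * y for x, y in zip(a, b))


def spatial_neighbors(i: int, n: int, k: int) -> List[int]:
    """Convolution-like selection: the k positions closest to i by index distance."""
    return sorted(range(n), key=lambda j: (abs(j - i), j))[:k]


def feature_neighbors(i: int, feats: List[Vector], k: int) -> List[int]:
    """Attention-like selection: the k positions most similar to feats[i]."""
    return sorted(range(len(feats)), key=lambda j: (-dot(feats[i], feats[j]), j))[:k]


def aggregate(feats: List[Vector], neighbors: List[int]) -> Vector:
    """Uniform mean over the selected neighbors (a simplification)."""
    dim = len(feats[0])
    return [sum(feats[j][d] for j in neighbors) / len(neighbors) for d in range(dim)]


# Five positions whose features alternate between two patterns.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

near = spatial_neighbors(0, len(feats), k=2)   # picks adjacent positions
similar = feature_neighbors(0, feats, k=2)     # picks look-alike positions

print(near, aggregate(feats, near))
print(similar, aggregate(feats, similar))
```

With spatial selection, position 0 is averaged with its adjacent position regardless of content; with feature selection, it is averaged with the nearest position carrying the same pattern. Only the selection rule changes; the aggregation step is shared.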
@misc{kang2025attentionconvolutionalnearestneighbors,
  title = {Attention Via Convolutional Nearest Neighbors},
  author = {Kang, Mingi and Neto, Jeová Farias Sales Rocha},
  year = {2025},
  eprint = {2511.14137},
  archiveprefix = {arXiv},
  primaryclass = {cs.CV},
  url = {https://arxiv.org/abs/2511.14137},
}
- Parallel qMRI Reconstruction from 4x Accelerated Acquisitions. Mingi Kang. 2025.
Magnetic Resonance Imaging (MRI) acquisitions require extensive scan times, limiting patient throughput and increasing susceptibility to motion artifacts. Accelerated parallel MRI techniques reduce acquisition time by undersampling k-space data, but require robust reconstruction methods to recover high-quality images. Traditional approaches like SENSE require both undersampled k-space data and pre-computed coil sensitivity maps. We propose an end-to-end deep learning framework that jointly estimates coil sensitivity maps and reconstructs images from only undersampled k-space measurements at 4x acceleration. Our two-module architecture consists of a Coil Sensitivity Map (CSM) estimation module and a U-Net-based MRI reconstruction module. We evaluate our method on multi-coil brain MRI data from 10 subjects with 8 echoes each, using 2x SENSE reconstructions as ground truth. Our approach produces visually smoother reconstructions compared to conventional SENSE output, achieving comparable visual quality despite lower PSNR/SSIM metrics. We identify key challenges including spatial misalignment between different acceleration factors and propose future directions for improved reconstruction quality.
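To make "undersampling k-space at 4x acceleration" concrete, the sketch below keeps every fourth phase-encode line and zeroes the rest. It is an illustration under stated assumptions, not the paper's sampling scheme: the abstract does not specify the mask pattern, so the uniform pattern, the function names, and the toy data are all hypothetical.

```python
from typing import List

KSpace = List[List[complex]]  # rows are phase-encode lines


def uniform_mask(n_lines: int, accel: int = 4) -> List[bool]:
    """Keep one line out of every `accel` lines (assumed uniform pattern)."""
    return [i % accel == 0 for i in range(n_lines)]


def undersample(kspace: KSpace, mask: List[bool]) -> KSpace:
    """Zero out the lines that were not acquired; kept lines are copied."""
    return [row[:] if keep else [0j] * len(row) for row, keep in zip(kspace, mask)]


# Toy k-space: 8 phase-encode lines with 2 samples each.
kspace = [[complex(r, c) for c in range(2)] for r in range(8)]
mask = uniform_mask(len(kspace), accel=4)
measured = undersample(kspace, mask)

print(sum(mask), "of", len(mask), "lines acquired")
```

A reconstruction method then has to recover the image from only the retained lines, which is the input setting the abstract describes.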
@misc{kang2025parallelqmrireconstruction4x,
  title = {Parallel qMRI Reconstruction from 4x Accelerated Acquisitions},
  author = {Kang, Mingi},
  year = {2025},
  eprint = {2511.18232},
  archiveprefix = {arXiv},
  primaryclass = {cs.CV},
  url = {https://arxiv.org/abs/2511.18232},
}
2024
- Structure and process-level lexical interactions in memory search: A case study of individuals with cochlear implants and normal hearing. Abhilasha A. Kumar, Mingi Kang, William G. Kronenberger, Michael N. Jones, and David B. Pisoni. In Proceedings of the Annual Meeting of the Cognitive Science Society, 2024.
Searching through memory is mediated by complex interactions between the underlying mental lexicon and the processes that operate on this lexicon. However, these interactions are difficult to study due to the effortless manner in which neurotypical individuals perform cognitive tasks. In this work, we examine these interactions within a sample of prelingually deaf individuals with cochlear implants and normal hearing individuals who were administered the verbal fluency task for the "animals" category. Specifically, we tested how different candidates for underlying mental lexicons and processes account for search behavior within the verbal fluency task across the two groups. The models learned semantic representations from different combinations of textual (word2vec) and speech-based (speech2vec) information. The representations were then combined with process models of memory search based on optimal foraging theory that incorporate different lexical sources for transitions within and between clusters of items produced in the fluency task. Our findings show that semantic, word frequency, and phonological information jointly influence search behavior and highlight the delicate balance of different lexical sources that produces successful search outcomes.
@inproceedings{Kumar2024Structure,
  author = {Kumar, Abhilasha A. and Kang, Mingi and Kronenberger, William G. and Jones, Michael N. and Pisoni, David B.},
  title = {{Structure and process-level lexical interactions in memory search: A case study of individuals with cochlear implants and normal hearing}},
  booktitle = {Proceedings of the Annual Meeting of the Cognitive Science Society},
  volume = {46},
  year = {2024},
  url = {https://escholarship.org/uc/item/7vn9q9hh},
  journal = {Proceedings of the Annual Meeting of the Cognitive Science Society},
}