publications
Publications by category in reverse chronological order.
2026
- IGLU: The Integrated Gaussian Linear Unit Activation Function. Mingi Kang, Zai Yang, and Jeova Farias Sales Rocha Neto. 2026.
Activation functions are fundamental to deep neural networks, governing gradient flow, optimization stability, and representational capacity. While ReLU has long been the dominant activation in deep architectures, modern transformer-based models increasingly adopt smoother, self-gated alternatives such as GELU. Despite their empirical success, the mathematical relationships among these functions and the principles underlying their effectiveness remain only partially understood. We introduce IGLU, a parametric activation function derived as a scale mixture of GELU gates under a half-normal mixing distribution. This derivation yields a closed-form expression whose gating component is exactly the Cauchy CDF, providing a principled one-parameter family that continuously interpolates between identity-like and ReLU-like behavior via a single sharpness parameter σ. Unlike GELU’s Gaussian gate, IGLU’s heavy-tailed Cauchy gate decays polynomially in the negative tail, guaranteeing non-zero gradients for all finite inputs and offering greater robustness to vanishing gradients. We further introduce IGLU-Approx, a computationally efficient rational approximation of IGLU, expressed entirely in terms of ReLU operations, that eliminates transcendental function evaluation. Through evaluations on CIFAR-10, CIFAR-100, and WikiText-103 across ResNet-20, ViT-Tiny, and GPT-2 Small, IGLU achieves competitive or superior performance against ReLU and GELU baselines on both vision and language tasks, with IGLU-Approx recovering this performance at substantially reduced computational cost. In particular, we show that employing a heavy-tailed gate leads to considerable performance gains on heavily imbalanced classification datasets.
@misc{kang2026igluintegratedgaussianlinear,
  title         = {IGLU: The Integrated Gaussian Linear Unit Activation Function},
  author        = {Kang, Mingi and Yang, Zai and Neto, Jeova Farias Sales Rocha},
  year          = {2026},
  eprint        = {2603.06861},
  archiveprefix = {arXiv},
  primaryclass  = {cs.LG},
  url           = {https://arxiv.org/abs/2603.06861},
}
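The abstract above states that IGLU's gate is exactly the Cauchy CDF with a sharpness parameter σ. A minimal sketch of that idea, assuming a zero-location Cauchy CDF with scale σ (this parameterization is our assumption, not necessarily the paper's exact formula):

```python
import numpy as np

def iglu(x, sigma=1.0):
    """Sketch of an IGLU-style activation: input gated by a Cauchy CDF.

    The Cauchy CDF with location 0 and scale sigma is
    0.5 + arctan(x / sigma) / pi; its tails decay polynomially, so the
    gate (and hence the gradient) never reaches exactly zero for finite x.
    """
    gate = 0.5 + np.arctan(x / sigma) / np.pi
    return x * gate

x = np.linspace(-4.0, 4.0, 9)
# Smaller sigma sharpens the gate toward a step, giving more ReLU-like behavior.
print(iglu(x, sigma=1.0))
print(iglu(x, sigma=0.1))
```

Note how negative inputs keep a small but non-zero output, unlike ReLU's hard zero.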
2025
- Attention Via Convolutional Nearest Neighbors. Mingi Kang and Jeová Farias Sales Rocha Neto. 2025.
The shift from Convolutional Neural Networks to Transformers has reshaped computer vision, yet these two architectural families are typically viewed as fundamentally distinct. We argue that convolution and self-attention, despite their apparent differences, can be unified within a single k-nearest neighbor aggregation framework. The critical insight is that both operations are special cases of neighbor selection and aggregation; convolution selects neighbors by spatial proximity, while attention selects by feature similarity, revealing they exist on a continuous spectrum. We introduce Convolutional Nearest Neighbors (ConvNN), a unified framework that formalizes this connection. Crucially, ConvNN serves as a drop-in replacement for convolutional and attention layers, enabling systematic exploration of the intermediate spectrum between these two extremes. We validate the framework’s coherence on CIFAR-10 and CIFAR-100 classification tasks across two complementary architectures: (1) Hybrid branching in VGG improves accuracy on both CIFAR datasets by combining spatial-proximity and feature-similarity selection; and (2) ConvNN in ViT outperforms standard attention and other attention variants on both datasets. Extensive ablations on k values and architectural variants reveal that interpolating along this spectrum provides regularization benefits by balancing local and global receptive fields. Our work provides a unifying framework that dissolves the apparent distinction between convolution and attention, with implications for designing more principled and interpretable vision architectures.
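The neighbor-selection-and-aggregation view described above can be sketched in a few lines. The function name and the mean aggregation below are illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def knn_aggregate(coords, feats, k=3, mode="spatial"):
    """Toy unification of convolution and attention as k-NN aggregation.

    mode="spatial": neighbors chosen by spatial proximity (convolution-like).
    mode="feature": neighbors chosen by feature similarity (attention-like).
    Aggregation here is a simple mean over the selected neighbors.
    """
    out = np.empty_like(feats)
    for i in range(len(feats)):
        if mode == "spatial":
            d = np.linalg.norm(coords - coords[i], axis=1)  # distance in space
        else:
            d = -feats @ feats[i]                            # negative dot product
        idx = np.argsort(d)[:k]                              # k nearest under the metric
        out[i] = feats[idx].mean(axis=0)
    return out

# Toy 1D sequence: 5 positions, 2-d features.
coords = np.arange(5, dtype=float)[:, None]
feats = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.], [1., 0.]])
print(knn_aggregate(coords, feats, k=3, mode="spatial"))
print(knn_aggregate(coords, feats, k=3, mode="feature"))
```

Swapping only the distance metric moves the same operator between convolution-like and attention-like behavior, which is the spectrum the abstract describes.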
@misc{kang2025attentionconvolutionalnearestneighbors,
  title         = {Attention Via Convolutional Nearest Neighbors},
  author        = {Kang, Mingi and Neto, Jeová Farias Sales Rocha},
  year          = {2025},
  eprint        = {2511.14137},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV},
  url           = {https://arxiv.org/abs/2511.14137},
}

- Parallel qMRI Reconstruction from 4x Accelerated Acquisitions. Mingi Kang. 2025.
Magnetic Resonance Imaging (MRI) acquisitions require extensive scan times, limiting patient throughput and increasing susceptibility to motion artifacts. Accelerated parallel MRI techniques reduce acquisition time by undersampling k-space data, but require robust reconstruction methods to recover high-quality images. Traditional approaches like SENSE require both undersampled k-space data and pre-computed coil sensitivity maps. We propose an end-to-end deep learning framework that jointly estimates coil sensitivity maps and reconstructs images from only undersampled k-space measurements at 4x acceleration. Our two-module architecture consists of a Coil Sensitivity Map (CSM) estimation module and a U-Net-based MRI reconstruction module. We evaluate our method on multi-coil brain MRI data from 10 subjects with 8 echoes each, using 2x SENSE reconstructions as ground truth. Our approach produces visually smoother reconstructions compared to conventional SENSE output, achieving comparable visual quality despite lower PSNR/SSIM metrics. We identify key challenges including spatial misalignment between different acceleration factors and propose future directions for improved reconstruction quality.
@misc{kang2025parallelqmrireconstruction4x,
  title         = {Parallel qMRI Reconstruction from 4x Accelerated Acquisitions},
  author        = {Kang, Mingi},
  year          = {2025},
  eprint        = {2511.18232},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV},
  url           = {https://arxiv.org/abs/2511.18232},
}
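As background for the abstract above, a minimal illustration of retrospective 4x Cartesian undersampling of k-space followed by a naive zero-filled reconstruction. This is not the paper's learned method; the image size, mask layout, and fully sampled calibration region are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))          # stand-in for a single coil image
kspace = np.fft.fftshift(np.fft.fft2(image))   # fully sampled k-space

# 4x undersampling: keep every 4th phase-encode line, plus a fully
# sampled low-frequency band at the center of k-space.
mask = np.zeros((64, 64))
mask[::4, :] = 1
mask[28:36, :] = 1
undersampled = kspace * mask

# Zero-filled reconstruction: inverse FFT with missing lines left at zero,
# which produces the aliasing that SENSE-style or learned methods must undo.
zero_filled = np.fft.ifft2(np.fft.ifftshift(undersampled)).real
print("sampled fraction of k-space:", mask.mean())
```

Joint estimation of coil sensitivity maps, as in the abstract's two-module architecture, is what lets a learned reconstruction resolve this aliasing without precomputed maps.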
2024
- Structure and process-level lexical interactions in memory search: A case study of individuals with cochlear implants and normal hearing. Abhilasha A. Kumar, Mingi Kang, William G. Kronenberger, and 2 more authors. In Proceedings of the Annual Meeting of the Cognitive Science Society, 2024.
Searching through memory is mediated by complex interactions between the underlying mental lexicon and the processes that operate on this lexicon. However, these interactions are difficult to study due to the effortless manner in which neurotypical individuals perform cognitive tasks. In this work, we examine these interactions within a sample of prelingually deaf individuals with cochlear implants and normal hearing individuals who were administered the verbal fluency task for the "animals" category. Specifically, we tested how different candidates for underlying mental lexicons and processes account for search behavior within the verbal fluency task across the two groups. The models learned semantic representations from different combinations of textual (word2vec) and speech-based (speech2vec) information. The representations were then combined with process models of memory search based on optimal foraging theory that incorporate different lexical sources for transitions within and between clusters of items produced in the fluency task. Our findings show that semantic, word frequency, and phonological information jointly influence search behavior and highlight the delicate balance of different lexical sources that produces successful search outcomes.
@inproceedings{Kumar2024Structure,
  author    = {Kumar, Abhilasha A. and Kang, Mingi and Kronenberger, William G. and Jones, Michael N. and Pisoni, David B.},
  title     = {{Structure and process-level lexical interactions in memory search: A case study of individuals with cochlear implants and normal hearing}},
  booktitle = {Proceedings of the Annual Meeting of the Cognitive Science Society},
  volume    = {46},
  year      = {2024},
  url       = {https://escholarship.org/uc/item/7vn9q9hh},
}