Browsing by Subject "Efficient deep learning"
Now showing 1 - 2 of 2
Item: Sparsity prior in efficient deep learning based solvers and models (2022-09-14)
Chen, Xiaohan; Wang, Zhangyang; Marculescu, Radu; Vikalo, Haris; Dimakis, Alexandros G.; Yin, Wotao

Deep learning has been empirically successful in recent years thanks to extremely over-parameterized models and data-driven learning from enormous amounts of data. However, deep learning models are limited in terms of efficiency, in two senses. First, many deep models are designed in a black-box manner: they are unaware of prior knowledge about the structure of the problems of interest and hence cannot exploit it. Such unawareness can cause redundant parameterization and inferior performance compared to more dedicated methods. Second, the extreme over-parameterization itself is inefficient in terms of model storage, memory requirements, and computational complexity. This severely constrains realistic applications of deep learning on mobile devices with limited resources. Moreover, the financial and environmental costs of training such enormous models are unreasonably high, running directly counter to the call for green AI.

In this work, we strive to address the inefficiency of deep models by introducing sparsity as an important prior in deep learning. Our efforts fall into three sub-directions. In the first direction, we aim to accelerate the solving process for a specific class of optimization problems with sparsity constraints. Instead of designing black-box deep learning models, we derive new parameterizations by absorbing insights from the sparse optimization field, resulting in compact deep-learning-based solvers with significantly reduced training costs and superior empirical performance. In the second direction, we introduce sparsity to deep neural networks via weight pruning. Pruning reduces redundancy in over-parameterized networks by removing superfluous weights, naturally lowering model storage and computational costs. We aim to push pruning to the limit by combining it with other compression techniques, yielding extremely efficient deep models that can be deployed and fine-tuned on edge devices. In the third direction, we investigate what sparsity brings to deep networks. Introducing sparsity into a deep network significantly changes the landscape of its loss function and hence its behavior during training. We aim to understand these changes and how to exploit them to train better sparse neural networks.

The main content of this work can be summarized as follows.

Sparsity Prior in Efficient Deep Solvers. We adopt the algorithm unrolling method to transform classic optimization algorithms into feed-forward deep neural networks that accelerate convergence by over 100x. We also provide theoretical guarantees of linear convergence for the newly developed solvers, which is faster than the convergence rate achievable with classic optimization. Meanwhile, the number of trainable parameters is reduced from millions to tens, and even down to 3 hyperparameters, decreasing the training time from hours to 6 minutes.
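To make the unrolling idea concrete: in LISTA-style unrolling, each soft-thresholding iteration of the classic ISTA algorithm for sparse coding becomes one layer of a feed-forward network whose matrices and thresholds are learned from data. The sketch below is a minimal illustration, not the dissertation's exact parameterization; the layer count, initialization scales, and names (UnrolledISTA, W, S, theta) are assumptions.

```python
# Minimal sketch of algorithm unrolling (LISTA-style), assuming PyTorch.
# Each ISTA iteration, z_{k+1} = soft(W x + S z_k, theta_k), becomes one
# layer with learnable W, S, and a per-layer threshold theta_k.
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):  # illustrative name, not the dissertation's
    def __init__(self, signal_dim, code_dim, num_layers=16):
        super().__init__()
        self.num_layers = num_layers
        self.W = nn.Parameter(0.01 * torch.randn(code_dim, signal_dim))
        self.S = nn.Parameter(0.01 * torch.randn(code_dim, code_dim))
        self.theta = nn.Parameter(0.1 * torch.ones(num_layers))

    @staticmethod
    def soft(z, theta):
        # soft-thresholding: the proximal operator of the L1 norm
        return torch.sign(z) * torch.relu(z.abs() - theta)

    def forward(self, x):  # x: [batch, signal_dim] -> z: [batch, code_dim]
        z = self.soft(x @ self.W.t(), self.theta[0])
        for k in range(1, self.num_layers):
            z = self.soft(x @ self.W.t() + z @ self.S.t(), self.theta[k])
        return z
```

Trained end-to-end on (signal, sparse code) pairs, such an unrolled network typically needs far fewer layers than ISTA needs iterations, which is the source of the convergence speedups the abstract reports.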
Sparsity Prior in Efficient Deep Learning. We investigate compressing deep networks by unifying pruning, quantization, and matrix factorization techniques to remove as much redundancy as possible, so that the resulting networks have low inference and/or training costs. The developed methods improve memory/storage efficiency and latency by at least 5x, varying over the datasets and models used.

Sparsity Prior in Sparse Neural Networks. We discuss the properties and behaviors of sparse deep networks through the lens of the lottery ticket hypothesis (LTH) and dynamic sparse training (DST), and explore their application to efficient training in computer vision, natural language processing, and Internet-of-Things (IoT) systems. With our sparse neural networks, the performance loss is significantly mitigated while training far fewer parameters, saving computation costs in general and communication costs in IoT systems in particular.
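As a concrete illustration of the magnitude pruning and lottery-ticket ideas described above, the sketch below performs global magnitude pruning and the snapshot/rewind/retrain loop used in LTH experiments. It is a minimal hedged sketch, not the dissertation's method; the sparsity level, the helper names (global_magnitude_masks, apply_masks), and the rewind point are assumptions.

```python
# Hedged sketch of global magnitude pruning with weight rewinding, in the
# spirit of lottery-ticket (LTH) experiments, assuming PyTorch.
import torch
import torch.nn as nn

def global_magnitude_masks(model, sparsity=0.8):
    """Return {name: 0/1 mask} pruning the smallest-magnitude weights globally."""
    scores = torch.cat([p.detach().abs().flatten()
                        for n, p in model.named_parameters() if "weight" in n])
    k = int(sparsity * scores.numel())
    threshold = scores.kthvalue(k).values if k > 0 else scores.new_tensor(0.0)
    return {n: (p.detach().abs() > threshold).float()
            for n, p in model.named_parameters() if "weight" in n}

def apply_masks(model, masks):
    """Zero out pruned weights in place (reapply after each optimizer step)."""
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in masks:
                p.mul_(masks[n])

# Typical LTH loop (illustrative): snapshot the initialization, train, prune,
# rewind to the snapshot, then retrain with the mask held fixed:
#   init_state = {k: v.clone() for k, v in model.state_dict().items()}
#   train(model); masks = global_magnitude_masks(model, sparsity=0.8)
#   model.load_state_dict(init_state); apply_masks(model, masks); retrain under masks
```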
Item: Theoretically-grounded efficient deep learning system design (2024-05)
Li, Guihong; Marculescu, Radu; Pan, David Zhigang; Kim, Hyeji; Wang, Atlas; Lin, Ming; Bhardwaj, Kartikeya

Deep learning has been revolutionizing human society in many respects. Given its significant potential across various domains, optimizing the efficiency of deep learning techniques has become crucial to enable more affordable and economical model deployment. Efficiency matters at every stage of the life-cycle of a deep learning system, including training, inference, and fine-tuning. In this dissertation, we concentrate on the post-training stages. More specifically, we propose systematic approaches that significantly improve the efficiency of both inference and fine-tuning, and the major contributions are grounded in theoretical principles.

This dissertation starts with improving inference efficiency by automatically designing hardware-efficient deep networks via neural architecture search (NAS). In particular, it first proposes ZiCo, a novel zero-shot proxy designed to predict test performance without expensive and tedious training (a hedged sketch of such a gradient-statistics proxy follows this abstract). By leveraging insights from a rigorous theoretical analysis of gradient properties across different samples, ZiCo consistently surpasses existing proxies across various applications, including image classification and pixel-level prediction. Moreover, the architectures discovered via ZiCo match the performance of those found by traditional NAS methods, at significantly reduced search time.

Beyond searching for efficient static networks, the dissertation delves into anytime inference. It presents TIPS, a Markov chain-based framework for efficiently searching for optimal network architectures under diverse hardware constraints. By modeling the training process as a discrete-time Markov chain (DTMC), TIPS identifies important computational paths, thereby improving both the accuracy and the convergence rate of anytime neural networks (AnytimeNNs). TIPS outperforms existing AnytimeNN training methods and achieves better accuracy-FLOPs trade-offs. Additionally, we propose a simple yet effective accuracy predictor based on block cosine similarity that achieves state-of-the-art performance across multiple networks and datasets.

Last but not least, the dissertation explores the emerging field of efficient fine-tuning. Specifically, we address how to efficiently fine-tune a trained model to remove prior information from the training set, i.e., "machine unlearning," tailored to image-to-image generative models. By introducing a unified framework and a computationally efficient algorithm, the proposed approach effectively removes information from the "forget" samples while preserving performance on the "retain" samples. Empirical validation on large-scale datasets underscores the algorithm's efficacy and compliance with data retention policies, representing a significant advance in the theoretical and empirical study of machine unlearning for generative models.

Through these synergistic explorations, this dissertation advances the state of the art in designing efficient deep learning systems. By emphasizing both theoretical insights and practical efficiency, this work aims to push forward the frontier of machine learning research, enabling more robust and scalable solutions for real-world applications.
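As a concrete illustration of the ZiCo idea mentioned in the second abstract, the sketch below scores an untrained network from the mean and standard deviation of per-parameter gradients across a few mini-batches, i.e., the gradient properties the abstract alludes to. It only loosely follows the published ZiCo formula; the function name zico_style_score, the aggregation details, and the constants are assumptions for illustration.

```python
# Hedged sketch of a ZiCo-style zero-shot proxy, assuming PyTorch.
import torch
import torch.nn as nn

def zico_style_score(model, batches, loss_fn):
    """Score an untrained model; `batches` is a short iterable of (x, y) pairs."""
    per_batch = {n: [] for n, p in model.named_parameters() if p.requires_grad}
    for x, y in batches:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                per_batch[n].append(p.grad.detach().abs().flatten())
    score = 0.0
    for n, grads in per_batch.items():
        if len(grads) < 2:
            continue  # need at least two batches for a std estimate
        g = torch.stack(grads)               # [num_batches, num_params]
        mean, std = g.mean(dim=0), g.std(dim=0)
        # large, consistent gradients (high mean, low variance) score higher
        score += torch.log((mean / (std + 1e-8)).sum() + 1e-8).item()
    return score
```

Because no training is needed, such a proxy can rank thousands of candidate architectures in minutes, which is what makes zero-shot NAS attractive in the first place.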