Site: https://www.comsoc.org/publications/best-readings/machine-learning-communications

Best Readings in Machine Learning in Communications

The field of machine learning (ML) has a long and extremely successful history. For example, the idea of using neural networks (NN) for intelligent machines dates back to as early as 1942 when a simple one-layer model was used to simulate the status of a single neuron. ML has shown overwhelming advantages in many areas, including computer vision, robotics, and natural language processing, where it is normally difficult to find a concrete mathematical model for feature representation. In those areas, ML has proved to be a powerful tool because it does not require a comprehensive specification of the model. Unlike these application areas, the development of communications has relied heavily on theories and models, from information theory to channel modelling. These traditional approaches are showing serious limitations, especially in view of the increased complexity of communication networks. Therefore, research on ML applied to communications, especially to wireless communications, is currently experiencing an incredible boom.

This collection of Best Readings focuses on ML in the physical and medium access control (MAC) layer of communication networks. ML can be used to improve each individual (traditional) component of a communication system, or to jointly optimize the entire transmitter or receiver. Therefore, after introducing some popular textbooks, tutorials, and special issues in this collection, we divide the technical papers into the following six areas:

Signal detection
Channel encoding and decoding
Channel estimation, prediction, and compression
End-to-end communications
Resource allocation
Selected topics

Although ML in communications is still in its infancy, we believe that a growing number of researchers will dedicate themselves to related studies and that ML will greatly change the way communication systems are designed in the near future.

Issued March 2019

Contributors

Geoffrey Ye Li
Professor
Georgia Institute of Technology
Atlanta, Georgia, USA

Jakob Hoydis
Member Technical Staff
Nokia Bell Labs
Paris-Saclay, France

Elisabeth de Carvalho
Professor
Aalborg University
Aalborg, Denmark

Alexios Balatsoukas-Stimming
Postdoctoral Researcher
École Polytechnique Fédérale de Lausanne
Lausanne, Switzerland

Zhijin Qin
Assistant Professor
Queen Mary University of London
London, UK

Editorial Staff

Matthew C. Valenti
Editor-in-Chief, ComSoc Best Readings
West Virginia University
Morgantown, WV, USA

Books

O. Simeone, A Brief Introduction to Machine Learning for Engineers, Foundations and Trends in Signal Processing, 12(3-4), 200-431, 2018.
Targeted specifically at engineers, this book provides a short introduction to key concepts and methods in machine learning (ML). Starting from first principles, it covers a wide range of topics, such as probabilistic models, supervised and unsupervised learning, graphical models, and approximate inference. Numerous reproducible numerical examples are provided to help understand the key ideas, while the well-selected and up-to-date list of references provides good entry points for readers willing to deepen their knowledge in a specific area. Overall, the book is an excellent starting point for engineers to familiarize themselves with the broad area of ML.

C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
This book not only presents developments in the area of machine learning but also provides a comprehensive introduction to the field. No previous knowledge of pattern recognition or machine learning is assumed, and readers only need to be familiar with multivariate calculus, basic linear algebra, and basic probability theory. It is aimed at graduate students, researchers, and practitioners in the area of machine learning, statistics, computer science, and signal processing.

I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press, 2016.
This is a book on deep learning from some of the pioneers of the field. The book starts with background notions on linear algebra and probability theory. The second part discusses a range of neural network architectures that are most commonly used to solve practical problems and gives guidelines on how to use these architectures through practical examples. Finally, in the third part of the book, the authors discuss a wide range of research-related topics in neural networks.

J. Watt, R. Borhani, and A. K. Katsaggelos, Machine Learning Refined: Foundations, Algorithms, and Applications, Cambridge University Press, 2016.
Written by experts in signal processing and communications, this book contains both a lucid explanation of mathematical foundations in machine learning (ML) as well as the practical real-world applications, such as natural language processing and computer vision. It is a perfect resource and an ideal reference for students and researchers. It is also a useful self-study guide for practitioners working in ML, computer science, and signal processing.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
This introductory book on reinforcement learning (RL) is one of the main references in the field. It provides a clear and intuitive explanation of the core principles and algorithms in RL, with very useful examples. The recent second edition of the book (August 2018) includes the most recent developments in RL.

Overviews and Tutorials

C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, “Machine Learning Paradigms for Next-Generation Wireless Networks,” IEEE Wireless Communications, vol. 24, no. 2, pp. 98-105, April 2017.
The article proposes to use machine learning (ML) paradigms to address challenges in the fifth generation (5G) wireless networks. The article first briefly reviews the rudimentary concepts of ML and introduces their compelling applications in 5G networks. With the help of ML, future smart 5G mobile terminals will autonomously access the most meritorious spectral bands with the aid of sophisticated spectral efficiency learning. The transmission protocols in 5G networks can be adaptively adjusted with the aid of quality of service learning/inference. The article assists the readers in refining the motivation, problem formulation, and methodology of powerful ML algorithms in the context of future wireless networks.

T. O'Shea and J. Hoydis, “An Introduction to Deep Learning for the Physical Layer,” IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563-575, December 2017.
This paper presents the idea of learning full physical-layer implementations of communication systems with the help of neural network-based autoencoders. The technique is evaluated through simulations in several simple scenarios, such as the AWGN, Rayleigh fading, and two-user interference channels, where state-of-the-art performance is achieved. The paper discusses several situations when and reasons why deep learning can lead to gains with respect to classical model-based approaches, presents some examples showing how expert knowledge can be injected into the neural network architecture, and outlines a list of future research challenges.

L. Liang, H. Ye, and G. Y. Li, “Toward Intelligent Vehicular Networks: A Machine Learning Framework,” IEEE Internet of Things Journal, vol. 6, no. 1, pp. 124-15, February 2019.
This article provides an extensive overview of how to use machine learning to address the pressing challenges of high-mobility vehicular networks. By learning the underlying dynamics of a vehicular network, better decisions can be made to optimize network performance. In particular, the article discusses employing reinforcement learning to manage the network resources as a promising alternative to prevalent optimization approaches.

M. Ibnkahla, “Applications of Neural Networks to Digital Communications - A Survey,” Elsevier Signal Processing, vol. 80, pp. 1185-1215, July 2000.
This classical survey paper provides an excellent overview of research, mostly carried out in the 1990s, on various applications of neural networks to communication systems. It is a good link between past research and future trends in machine learning in communications.

K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep Reinforcement Learning: A Brief Survey,” IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 26-38, November 2017.
This survey first introduces the principles of deep reinforcement learning (RL) and then presents the main streams of value-based and policy-based methods. It covers all important algorithms in deep RL, including the deep Q-network, trust region policy optimization, and asynchronous advantage actor critic. At the end of the article, several current research areas in the field of deep RL are introduced.

Special Issues

“Machine Learning for Cognition in Radio Communications and Radar,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 3-247, February 2018.

“Machine Learning and Data Analytics for Optical Communications and Networking,” IEEE/OSA Journal of Optical Communications and Networking, vol. 10, no. 10, October 2018.

Technical Papers

Papers - Topic: Signal Detection

H. Ye, G. Y. Li, and B.-H. Juang, “Power of Deep Learning for Channel Estimation and Signal Detection in OFDM Systems,” IEEE Wireless Communications Letters, vol. 7, no. 1, pp. 114-117, February 2018.
This paper proposes a deep learning-based joint channel estimation and signal detection approach. A deep neural network is trained to recover the transmitted data from the received signals corresponding to the transmitted data and pilots. This method outperforms the minimum mean-squared error method when the system lacks adequate pilots or a cyclic prefix, or suffers from nonlinear distortions.

N. Samuel, T. Diskin, and A. Wiesel, “Deep MIMO Detection,” in Proc. IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Sapporo, Japan, July 2017.
The paper uses deep learning for massive multi-input multi-output (MIMO) detection by unfolding a projected gradient descent method, and applies the approach to time-invariant and time-varying channels. The deep learning algorithm provides lower complexity than approximate message passing and semidefinite relaxation with the same accuracy and enhanced robustness.
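
To make the idea of unfolding concrete, the following is a minimal sketch of plain projected gradient descent for detecting BPSK symbols from y = Hx + n, written with the Python standard library only. It is not the network from the paper: in an unfolded detector each iteration becomes a layer with trainable parameters, whereas here the step size, the number of "layers", the channel matrix, and the noise values are all hand-picked assumptions for illustration.

```python
# Toy sketch: projected gradient descent for x in {-1, +1}^K given y = H x + n.
# All numbers are made up; a learned detector would train per-layer parameters.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]

def project(vector):
    """Clip each entry to [-1, 1], the convex hull of the BPSK alphabet."""
    return [max(-1.0, min(1.0, v)) for v in vector]

def pgd_detect(H, y, num_layers=10, step=0.5):
    """Run a fixed number of projected-gradient iterations ("layers")."""
    Ht = transpose(H)
    x = [0.0] * len(H[0])
    for _ in range(num_layers):
        residual = [r - t for r, t in zip(matvec(H, x), y)]
        gradient = matvec(Ht, residual)  # gradient of 0.5 * ||y - Hx||^2
        x = project([xi - step * gi for xi, gi in zip(x, gradient)])
    # Hard decision on the final soft estimate.
    return [1 if xi >= 0 else -1 for xi in x]

# Hypothetical 4x2 channel, transmitted symbols, and small noise.
H = [[1.0, 0.2], [0.1, 0.9], [0.3, -0.4], [-0.5, 0.6]]
x_true = [1, -1]
noise = [0.05, -0.03, 0.02, 0.01]
y = [v + n for v, n in zip(matvec(H, x_true), noise)]
x_hat = pgd_detect(H, y)
```

Fixing the number of iterations in advance is what makes the algorithm resemble a feed-forward network of known depth.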

N. Farsad and A. Goldsmith, “Neural Network Detection of Data Sequences in Communication Systems,” IEEE Transactions on Signal Processing, vol. 66, no. 21, pp. 5663-5678, November 2018.
This paper describes a bidirectional recurrent neural network for sequence detection in channels with memory. The method does not require knowledge of the channel model. Alternatively, if the channel model is known, it does not require knowledge of the channel state information (CSI). Simulation and experimental results show that the developed method works well and can outperform Viterbi detection in certain scenarios.

Papers - Topic: Channel Encoding and Decoding

N. Farsad, M. Rao, and A. Goldsmith, “Deep Learning for Joint Source-Channel Coding of Text,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, April 2018.
This paper addresses joint source and channel coding of structured data, such as natural language, over a noisy channel. The typical approach to this problem is optimal in terms of minimizing end-to-end distortion only when both the source and channel have arbitrarily long block lengths, which is not necessarily optimal for finite-length documents or encoders. This paper demonstrates that, in this scenario, a deep learning based encoder and decoder can achieve lower word-error rates.

T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, “On Deep Learning-based Channel Decoding,” in Proc. Conference on Information Sciences and Systems (CISS), Baltimore, USA, March 2017.
The authors of this paper use neural networks to learn decoders for random and structured codes, such as polar codes. The key observations are that (i) optimal bit-error rate performance for both code families and short codeword lengths can be achieved, (ii) structured codes are easier to learn, and (iii) the neural network is able to generalize to codewords that it has never seen during training for the structured codes, but not for the random codes. Scaling to long codewords is identified as the main challenge for neural network-based decoding due to the curse of dimensionality.

E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be’ery, “Deep Learning Methods for Improved Decoding of Linear Codes,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 119-131, February 2018.
The paper applies deep learning to the decoding of linear block codes with short to moderate block length based on a recurrent neural network architecture. The methods show advantages in complexity and performance in the belief propagation and min-sum algorithms.

Papers - Topic: Channel Estimation, Prediction, and Compression

H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Deep Learning-Based Channel Estimation for Beamspace mmWave Massive MIMO Systems,” IEEE Wireless Communications Letters, vol. 7, no. 5, pp. 852-855, October 2018.
This paper develops a deep learning-based channel estimation network for beamspace millimeter-wave massive multi-input multi-output (MIMO) systems. A neural network is used to learn the channel structure and estimate the channel from a large amount of training data. The paper also provides an analytical framework for the asymptotic performance of the channel estimator. Results demonstrate that the neural network significantly outperforms state-of-the-art compressed sensing-based algorithms even when the receiver is equipped with a small number of RF chains.

C.-K. Wen, W.-T. Shih, and S. Jin, “Deep Learning for Massive MIMO CSI Feedback,” IEEE Wireless Communications Letters, vol. 7, no. 5, pp. 748-751, October 2018.
This article develops a novel channel state information (CSI) sensing and recovery mechanism using deep learning. The new approach learns to exploit channel structure effectively from training samples and transforms CSI to a near-optimal number of representations/codewords. The proposed approach can recover CSI with significantly improved reconstruction performance compared to the existing compressive sensing (CS)-based methods, even at excessively low compression regions where the traditional CS-based methods fail.

Y. Wang, M. Narasimha, and R. W. Heath, Jr., “MmWave Beam Prediction with Situational Awareness: A Machine Learning Approach,” in Proc. IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Kalamata, Greece, June 2018.
This article combines machine learning tools and situational awareness to learn the beam information, e.g., received power and the optimal beam index, in millimeter-wave communication systems. It uses the vehicle locations as features to predict the received power of any beam in the beam codebook and shows that situational awareness can largely improve the prediction accuracy. The method requires almost no overhead and can achieve high throughput with only a small performance degradation.

D. Neumann, T. Wiese, and W. Utschick, “Learning the MMSE Channel Estimator,” IEEE Transactions on Signal Processing, vol. 66, no. 11, pp. 2905-2917, June 2018.
This paper addresses the problem of estimating Gaussian random vectors with random covariance matrices. The authors develop a neural network architecture inspired by expert knowledge about the covariance matrix structure, which achieves state-of-the-art performance with an order of magnitude less complexity. This is a good example of how expert knowledge can be combined with machine learning methods to outperform purely model-based approaches.

Papers - Topic: End-to-End Communications

S. Dörner, S. Cammerer, J. Hoydis, and S. ten Brink, “Deep Learning-Based Communication over the Air,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 132-143, February 2018.
This paper reports the world’s first implementation of a fully neural network-based communication system using software-defined radios. The authors identify the “missing channel gradient” as the biggest obstacle in training such systems over actual channels and propose a workaround based on model-based training in simulations followed by receiver-finetuning on measured data. Their implementation comes close to, but does not outperform, a well-designed baseline. A special neural network structure for the task of synchronization to single-carrier waveforms is introduced.

H. Ye, G. Y. Li, B.-H. Juang, and K. Sivanesan, “Channel Agnostic End-to-End Learning Based Communication Systems with Conditional GAN,” in Proc. IEEE Global Communication Conference (GLOBECOM) Workshops, Abu Dhabi, UAE, December 2018.
This paper employs a conditional generative adversarial net (GAN) to build an end-to-end communication system without a channel model, where deep neural networks (DNNs) represent both the transmitter and the receiver. The conditional GAN learns to generate the channel effects and acts as a bridge for the gradients to pass through in order to jointly train and optimize both the transmitter and the receiver DNNs.

F. Ait Aoudia and J. Hoydis, “End-to-End Learning of Communications Systems without a Channel Model,” in Proc. Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, USA, October 2018.
The authors provide a solution to the problem of training autoencoder-based communication systems over actual channels without any channel model. The key idea is to estimate the channel gradient using the technique of policy gradients from the field of reinforcement learning. They show through simulations that this approach works as well as model-based learning for the AWGN and Rayleigh channels.
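
The underlying principle, estimating a gradient through a black box with the score-function (policy gradient) trick, can be sketched on a toy scalar problem. The code below is only an illustration under stated assumptions and not the authors' training procedure: the "transmitter" is a single number theta, the function black_box_loss is a hypothetical stand-in for sending a signal over the channel and receiving a per-example loss from the receiver, and the exploration variance, sample count, and mean-loss baseline are arbitrary choices.

```python
import random

def black_box_loss(x, rng):
    """Hypothetical stand-in for: transmit x, receive it, compute a loss.

    The estimator below only sees the returned number, never this formula.
    """
    channel_noise = rng.gauss(0.0, 0.1)
    return (x + channel_noise - 2.0) ** 2

def estimate_gradient(theta, sigma=0.5, num_samples=20000, seed=0):
    """Score-function estimate of d/d(theta) of the expected loss.

    The transmitter output is perturbed with Gaussian exploration noise w,
    and the gradient is estimated as the average of loss * w / sigma^2.
    Subtracting the mean loss (a baseline) reduces the variance.
    """
    rng = random.Random(seed)
    perturbations = [rng.gauss(0.0, sigma) for _ in range(num_samples)]
    losses = [black_box_loss(theta + w, rng) for w in perturbations]
    baseline = sum(losses) / num_samples
    weighted = sum((loss - baseline) * w for loss, w in zip(losses, perturbations))
    return weighted / (num_samples * sigma ** 2)

# For this toy loss the true gradient at theta = 0 is 2 * (0 - 2) = -4,
# so the estimate can be checked against a known value.
grad_estimate = estimate_gradient(theta=0.0)
```

No derivative of the channel is ever computed; only sampled losses are needed, which is what makes the approach usable when no differentiable channel model is available.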

B. Karanov, M. Chagnon, F. Thouin, T. A. Eriksson, H. Bülow, D. Lavery, P. Bayvel, and L. Schmalen, “End-to-End Deep Learning of Optical Fiber Communications,” Journal of Lightwave Technology, vol. 36, no. 20, pp. 4843-4855, October 2018.
This paper uses an autoencoder to learn transmitter and receiver neural networks for use in optical communications. The experimental results show that the autoencoder can effectively learn to deal with nonlinearities and that it provides good performance for different link dispersions.

Papers - Topic: Resource Allocation

H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos, “Learning to Optimize: Training Deep Neural Networks for Interference Management,” IEEE Transactions on Signal Processing, vol. 66, no. 20, pp. 5438-5453, October 2018.
This paper exploits deep neural networks (DNNs) to address optimization and interference management problems. The input-output relationship of a signal-processing (SP) algorithm is treated as an unknown nonlinear mapping and is approximated by a DNN. The paper demonstrates that SP tasks can be performed effectively for those optimization problems that can be learned accurately by a DNN of moderate size. It also identifies a class of optimization algorithms that can be learned by a moderate-size DNN and then uses an interference management algorithm as an example to demonstrate the effectiveness of the proposed approach.

V. Va, J. Choi, T. Shimizu, G. Bansal, and R. W. Heath, Jr., “Inverse Multipath Fingerprinting for Millimeter Wave V2I Beam Alignment,” IEEE Transactions on Vehicular Technology, vol. 67, no. 5, pp. 4042-4058, May 2018.
This paper uses multipath fingerprinting to address the beam alignment problem in millimeter wave vehicle-to-infrastructure communications. Based on the vehicle's position (e.g., available via GPS), the multipath fingerprint/signature is first obtained from a database and provides prior knowledge of potential pointing directions for reliable beam alignment, which can be regarded as the inverse of fingerprinting localization. From the extensive simulation results, the proposed approach provides increasing rates with larger antenna arrays while IEEE 802.11ad has decreasing rates due to the higher beam training overhead.
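
The database lookup described above can be pictured with a small sketch. It is a deliberately simplified assumption, not the paper's algorithm: positions are reduced to one coordinate along the road, a fingerprint is reduced to a count of which beam index was best in past measurements for each location bin, and the class name BeamFingerprintDb, the bin size, and the training budget are hypothetical.

```python
from collections import Counter, defaultdict

class BeamFingerprintDb:
    """Maps a location bin to counts of beams that worked best there."""

    def __init__(self, bin_size_m=5.0):
        self.bin_size_m = bin_size_m
        self.bins = defaultdict(Counter)

    def _bin(self, position_m):
        """Quantize a 1-D position (meters along the road) to a bin index."""
        return int(position_m // self.bin_size_m)

    def record(self, position_m, best_beam):
        """Store one past measurement: the best beam seen at this position."""
        self.bins[self._bin(position_m)][best_beam] += 1

    def candidate_beams(self, position_m, budget=2):
        """Return up to `budget` beams to try first, most frequent first."""
        counts = self.bins.get(self._bin(position_m), Counter())
        return [beam for beam, _ in counts.most_common(budget)]

# Made-up measurement history: (position in meters, best beam index).
db = BeamFingerprintDb()
for position, beam in [(12.0, 7), (13.5, 7), (11.0, 7), (14.0, 9),
                       (12.5, 9), (10.5, 3), (27.0, 15)]:
    db.record(position, beam)

# A vehicle reporting position 13.0 m trains only the listed beams
# instead of sweeping the whole codebook.
beams_to_try = db.candidate_beams(13.0, budget=2)
```

Restricting beam training to a short candidate list is what keeps the overhead from growing with the size of the antenna array.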

U. Challita, L. Dong, and W. Saad, “Proactive Resource Management for LTE in Unlicensed Spectrum: A Deep Learning Perspective,” IEEE Transactions on Wireless Communications, vol. 17, no. 7, pp. 4674-4689, July 2018.
This paper develops a deep learning-based resource allocation framework for coexistence of long term evolution (LTE) networks with licensed assisted access (LTE-LAA) and WiFi in the unlicensed spectrum. With long short-term memory, each small-cell base station is able to decide on its spectrum allocation autonomously by requiring only limited information on the network state.

R. Daniels and R. W. Heath, Jr., “An Online Learning Framework for Link Adaptation in Wireless Networks,” in Proc. Information Theory and Applications Workshop (ITA), San Diego, USA, February 2009.
This paper is one of the earliest works on machine learning (ML) for link adaptation. The motivation for using ML lies in the difficulty to model the impairments in wireless communications (nonlinearities, interference). It uses real-time measurements to build and continuously adapt a classification procedure based on k-nearest neighbor. Follow-up work relies on support vector machines (SVMs).
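
A minimal sketch of how a k-nearest neighbor classifier can be adapted online is given below. It only illustrates the general mechanism and makes several assumptions that do not come from the paper: the feature is a single SNR value in dB, the label is the index of the best modulation and coding scheme (MCS), and the class name KnnLinkAdapter and all measurements are hypothetical.

```python
import math
from collections import Counter

class KnnLinkAdapter:
    """k-NN classifier whose training set grows with every new measurement."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (feature_tuple, best_mcs_index)

    def observe(self, features, best_mcs):
        """Add one real-time measurement; this is the online adaptation step."""
        self.samples.append((tuple(features), best_mcs))

    def predict(self, features):
        """Majority vote among the k stored samples closest to `features`."""
        if not self.samples:
            raise ValueError("no measurements observed yet")
        nearest = sorted(self.samples,
                         key=lambda s: math.dist(s[0], features))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

# Made-up history: (SNR in dB, index of the MCS that performed best).
adapter = KnnLinkAdapter(k=3)
for snr_db, mcs in [(5.0, 0), (6.0, 0), (12.0, 1), (13.0, 1),
                    (14.0, 1), (20.0, 2), (21.0, 2), (22.0, 2)]:
    adapter.observe((snr_db,), mcs)

mcs_before = adapter.predict((17.0,))

# New measurements show that the higher-rate MCS already works near 17 dB,
# so the decision at that operating point changes without any retraining.
adapter.observe((17.0,), 2)
adapter.observe((16.5,), 2)
mcs_after = adapter.predict((17.0,))
```

Because k-NN stores measurements rather than fitting a parametric model, adapting to a changed environment amounts to appending new samples.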

Papers - Topic: Selected Topics

T. J. O’Shea, T. Roy, and T. C. Clancy, “Over-the-Air Deep Learning Based Radio Signal Classification,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 168-179, February 2018.
In this paper, the widely studied problem of modulation classification is revisited using a neural network operating on raw IQ samples. The authors demonstrate that neural networks can outperform the best-known alternative methods based on expert features for several realistic datasets obtained from over-the-air measurements and simulations.

X. Wang, L. Gao, S. Mao, and S. Pandey, “CSI-Based Fingerprinting for Indoor Localization: A Deep Learning Approach,” IEEE Transactions on Vehicular Technology, vol. 66, no. 1, pp. 763-776, January 2017.
The paper presents an indoor localization method based on deep learning (DL). The DL algorithm exploits channel state information in the frequency domain (amplitude and phase) from three distant antennas for an indoor OFDM system. Location is mapped to fingerprints that are the optimal weights of the deep learning network. Training and testing are performed with experimental data.

M. Kim, N.-I. Kim, W. Lee, and D.-H. Cho, “Deep Learning-Aided SCMA,” IEEE Communications Letters, vol. 22, no. 4, pp. 720-723, April 2018.
This article uses machine learning to design sparse code multiple access (SCMA) schemes. Deep neural networks (DNNs) are used to adaptively construct a codebook and learn a decoding strategy to minimize bit-error rate for SCMA. The proposed deep learning based SCMA can provide improved spectral efficiency and massive connectivity, which is a promising technique for 5G wireless communication systems.

C. Studer, S. Medjkouh, E. Gönültaş, T. Goldstein, and O. Tirkkonen, “Channel Charting: Locating Users within the Radio Environment Using Channel State Information,” IEEE Access, vol. 6, pp. 47682-47698, August 2018.
This paper uses passively collected channel state information (CSI) in conjunction with autoencoder neural networks and other machine learning methods in order to perform relative localization of users within a cell. Extensive simulation results demonstrate that the various proposed methods are able to successfully preserve the local geometry of users and that the autoencoder performs particularly well.

J. Vieira, E. Leitinger, M. Sarajlic, X. Li, and F. Tufvesson, “Deep Convolutional Neural Networks for Massive MIMO Fingerprint-Based Positioning,” in Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Montreal, Canada, October 2017.
The paper exploits the sparsity of the channel in the angular domain in a massive multi-input multi-output (MIMO) system to build a map between user position and channel angular pattern. To learn the mapping, the paper uses a convolutional neural network that is trained using measured and simulated channels.