Welcome!

ITU invites you to participate in the ITU Artificial Intelligence/Machine Learning in 5G Challenge, a competition that is scheduled to run from now until the end of the year. Participation in the Challenge is free of charge and open to all interested parties from countries that are members of ITU.
Detailed information can be found on the Challenge website, which includes the document “ITU AI/ML 5G Challenge: Participation Guidelines”.


The Bonch-Bruevich St. Petersburg State University of Telecommunications (SPbSUT) is glad to announce the challenge "Traffic recognition and Long-term traffic forecasting based on AI algorithms and metadata for 5G/IMT-2020 and beyond", which is organized as part of the “ITU Artificial Intelligence/Machine Learning in 5G Challenge”.


Registration

Closed (deadline: September 5 / September 11)

For instructions on how to register for this challenge, see:
Step-by-Step Instruction
To register for this challenge, use the following link:
Register (Closed)


Overview

The advent of 5G is introducing new challenges for mobile communications service providers, and integrating artificial intelligence (AI) techniques into networks is one way the industry is addressing these complexities.

The 5G/IMT-2020 network will require robust, smart algorithms to adapt network protocols and resource management to different services in different scenarios. Recent developments in deep learning, convolutional neural networks, and reinforcement learning hold considerable promise for solving very complex problems that were considered intractable until now.

According to the International Telecommunication Union recommendation ITU-R M.2083-0, IMT vision - “Framework and overall objectives of the future development of IMT-2020 and beyond”, the infrastructure will be based on Software-Defined Networking (SDN) and Network Function Virtualization (NFV) in order to provide a new level of quality and service control.

In general, a significant number of available Internet services and applications place strict requirements on network parameters such as latency, jitter, RTT, and bandwidth. SDN-based technologies should be able to control and manage dynamic QoS for new, time-constrained services.

First, precise prediction and recognition of future 5G network traffic will help us design greener, traffic-aware networks. Second, traffic prediction is required to use network resources efficiently. Accurate prediction of network traffic at access points enables efficient resource allocation to ensure good quality of service. In addition, data analysis techniques can be used to find specific patterns that help to recognize device types.

To improve the quality of communications and the automation of processes, AI technologies need to be implemented in 5G networks for traffic monitoring and dynamic traffic management.


Problem statement

The goal of this challenge is to create a solution based on AI/ML techniques, such as deep learning, that estimates performance based on the prediction and recognition of traffic flows from their metadata. The research problem focuses on integrating AI algorithms with the 5G/IMT-2020 network (SDN/NFV) independently of the underlying hardware solutions.

Baseline

The key feature of the proposal is the use of flow metadata (Fig. 1.1, 1.2) from the data plane, while the analytical application with AI/ML algorithms is located at the service level and works with the SDN/NFV network via northbound APIs.

Fig. 1.1 Structure of simple OpenFlow table with metadata of flows

The circled data in Fig. 1.1 are the data that should be used to form the meta-model of a network flow. They include two main counters: Byte Count and Packet Count. In addition to these counters, the flow table contains another main parameter, the Time Stamp, which enables the calculation of ByteCount-delta and PacketCount-delta. An important property of these data is that the “Byte Count” and “Packet Count” counters alone do not allow the exact length of each individual packet registered in the stream during a time interval ΔT to be determined [1, 2].
However, given samples of the [Byte Count], [Packet Count], and [Time Stamp] values over an arbitrary period of time ΔT, it is possible to create a data set with a fixed structure in which each sample holds the instantaneous values of [ByteCount-delta] and [PacketCount-delta]. The values of [Byte Count], [Packet Count], and [Time Stamp] are obtained through instantaneous requests to the SDN controller via its RESTful application programming interface (API) [1, 2].
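
The delta calculation described above can be sketched in Python as follows. This is a minimal illustration, not the organizers' data preparation code: the `FlowSample` and `FlowDelta` types, the `counter_deltas` function, and the sample values are all hypothetical, and the samples are assumed to have already been read from the controller. Note that only the mean packet length per interval can be derived, in line with the limitation stated in the text.

```python
from typing import List, NamedTuple


class FlowSample(NamedTuple):
    """One reading of a flow entry (hypothetical structure)."""
    timestamp: float    # time of the controller query, in seconds
    byte_count: int     # cumulative Byte Count counter
    packet_count: int   # cumulative Packet Count counter


class FlowDelta(NamedTuple):
    """Per-interval values derived from two consecutive readings."""
    dt: float               # interval length, in seconds
    byte_delta: int         # ByteCount-delta
    packet_delta: int       # PacketCount-delta
    mean_packet_len: float  # average only; exact lengths are not recoverable


def counter_deltas(samples: List[FlowSample]) -> List[FlowDelta]:
    """Turn cumulative counter readings into per-interval deltas."""
    deltas = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.timestamp - prev.timestamp
        db = cur.byte_count - prev.byte_count
        dp = cur.packet_count - prev.packet_count
        # Skip out-of-order readings and counter resets.
        if dt <= 0 or db < 0 or dp < 0:
            continue
        mean_len = db / dp if dp else 0.0
        deltas.append(FlowDelta(dt, db, dp, mean_len))
    return deltas


# Made-up readings taken once per second.
samples = [
    FlowSample(0.0, 0, 0),
    FlowSample(1.0, 3000, 4),
    FlowSample(2.0, 3000, 4),   # idle interval: no new packets
    FlowSample(3.0, 4500, 5),
]
deltas = counter_deltas(samples)
for d in deltas:
    print(d)
```

Each resulting row corresponds to one time interval and could serve as one sample of the data set; the exact structure of the published data sets may differ.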

Fig. 1.2. Data preparation

We present a machine learning-based approach to recognizing and predicting SDN network traffic by analyzing the metadata of streams of sent and received packets. We built an experimental network to generate network traffic data (IoT and Video). The general architecture of the proposal is shown in Fig. 1.3.

Fig. 1.3. General architecture

Direction 1: AI for traffic recognition and classification

Machine Learning models for traffic recognition based on Metadata (data set) of flows.

Direction 2: AI for Long-term traffic forecasting

Long-term traffic forecasting on the data plane of recognized traffic based on the Metadata (data set).



Task

Based on the proposed method, provide the following:

  • An ML model for traffic recognition based on metadata (the published data sets);
  • An ML model for the subsequent long-term forecasting of traffic (flows);
  • A proposal that combines the 1st and 2nd algorithms (theoretical).

The Output Format

    The expected output is a report* that includes the following:

  • Problem analysis, including a gap analysis of current approaches to solving the defined research problem (~2 pages);
  • Architectural scheme, models, and algorithms in UML notation (~1 page);
  • Description of the solution/suggestion (~1 page);
  • Modeling results as graphs, with explanations (~1-2 pages);
  • Source code with the ML and Big Data (if necessary) algorithms;
  • Trained ML models;
  • Results in a CSV file containing the results of training: the necessary parameters (see the Evaluation section).

    * The “.docx” format is required for the report.

    Data Set

    The structure of the training data set, DataSet_ML, can be defined as follows:

    formula
    The data set describes two flows on the data plane: IoT and Video.

    Evaluation

    Task 1:

    Recognition accuracy is evaluated by the probability of recognition, expressed as a percentage. In addition, the confusion matrix of the trained neural network is used as a second measure of NN performance.
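
The Task 1 measures can be illustrated with a short Python sketch. It assumes that the "probability of recognition" is the share of correctly classified samples (overall accuracy); the organizers may define it differently. The function names and the example labels are hypothetical and are not taken from the published data sets.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def accuracy_percent(matrix: List[List[int]]) -> float:
    """Share of samples on the main diagonal, as a percentage."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total if total else 0.0


# Made-up labels for the two traffic classes named in the data set.
labels = ["IoT", "Video"]
y_true = ["IoT", "IoT", "Video", "Video", "Video"]
y_pred = ["IoT", "Video", "Video", "Video", "IoT"]

cm = confusion_matrix(y_true, y_pred, labels)
acc = accuracy_percent(cm)
print(cm)   # [[1, 1], [1, 2]]
print(acc)  # 60.0
```

The off-diagonal cells show which class is mistaken for which, which a single percentage cannot convey.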

    Task 2:

    Prediction accuracy is evaluated using the Mean Absolute Percentage Error (MAPE), eq. 1, and the Root Mean Square Error (RMSE), eq. 2.

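    $$\mathrm{MAPE} = \frac{100\%}{N} \sum_{t=1}^{N} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \qquad (1)$$
    $$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( y_t - \hat{y}_t \right)^2} \qquad (2)$$
    where N is the total number of observations, y_t is the actual value, and \hat{y}_t is the predicted value.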

    The winners will be the solutions with the lowest MAPE and RMSE scores in Task 2 and the highest probability of recognition in Task 1.
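
The two Task 2 metrics can be computed as in the following Python sketch, which assumes the standard definitions of MAPE (in percent) and RMSE over N paired observations. The function names and the example values are hypothetical; MAPE is undefined when an actual value is zero, so the sketch rejects such input.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent."""
    _check(actual, predicted)
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    n = len(actual)
    return 100.0 / n * sum(abs((y - y_hat) / y)
                           for y, y_hat in zip(actual, predicted))


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error, in the units of the data."""
    _check(actual, predicted)
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2
                         for y, y_hat in zip(actual, predicted)) / n)


# Made-up traffic values (for example, bytes per interval).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
print(round(mape_value, 2), round(rmse_value, 2))
```

MAPE is scale-free and therefore comparable across flows with different volumes, while RMSE keeps the units of the data and penalizes large errors more heavily.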



    Final submissions and winners

    After the score-based evaluation phase (see "Evaluation" section), a ranking of all the teams will be published.

    The top 5 teams must then send the organizers: (1) the code of the proposed ML solution, (2) the trained model, and (3) a report.

    The top 3 solutions of this challenge will advance to the next level of the ITU AI/ML in 5G Challenge, which will award the 3 best solutions among all the challenges proposed under this initiative. More details are given in the ITU guidelines document for participants.

    Resources

    We summarize below the main resources provided for this challenge:
  • Summary slides (Link: Slides)
  • Training datasets (Link: DataSets)
  • Test datasets (Link: DataSets)

    Contact and updates

    Please contact us if you have questions about the challenge task. Mailing list (write to all three addresses at once, for example with one as the main recipient and the others in copy):
  • artemanv.work@gmail.com
  • ammarexpress@gmail.com
  • alirefaee@azhar.edu.eg

    Organizers and Experts


    Artem N. Volkov

    PhD Student, Researcher
    artemanv.work@gmail.com

    Ali Refaee Abdellah

    PhD Student, Researcher
    alirefaee@azhar.edu.eg

    Dr. Vasiliy S. Elagin

    Head of the PhD department
    elagin.vas@gmail.com

    Dr. Ammar Muthanna

    Head of SDN laboratory, PhD
    ammarexpress@gmail.com