A guide for installing, configuring and operating the Nemesida AI MLC machine learning module, designed for building behavioral models, as well as detecting brute force, flood and DDoS attacks.

Installation

Not used in Nemesida WAF Free.

Before installing Nemesida WAF components, add repository information to the system:

# apt install apt-transport-https gnupg2
Debian 9
# echo "deb https://nemesida-security.com/repo/nw/debian stretch non-free" > /etc/apt/sources.list.d/NemesidaWAF.list
Debian 10
# echo "deb https://nemesida-security.com/repo/nw/debian buster non-free" > /etc/apt/sources.list.d/NemesidaWAF.list
Debian 11
# echo "deb https://nemesida-security.com/repo/nw/debian bullseye non-free" > /etc/apt/sources.list.d/NemesidaWAF.list
# wget -O- https://nemesida-security.com/repo/nw/gpg.key | apt-key add -
# apt update && apt upgrade
# apt install apt-transport-https gnupg2
Ubuntu 18.04
# echo "deb [arch=amd64] https://nemesida-security.com/repo/nw/ubuntu bionic non-free" > /etc/apt/sources.list.d/NemesidaWAF.list
# wget -O- https://nemesida-security.com/repo/nw/gpg.key | apt-key add -
# apt update && apt upgrade
Ubuntu 20.04
# echo "deb [arch=amd64] https://nemesida-security.com/repo/nw/ubuntu focal non-free" > /etc/apt/sources.list.d/NemesidaWAF.list
# wget -O- https://nemesida-security.com/repo/nw/gpg.key | apt-key add -
# apt update && apt upgrade
Ubuntu 22.04
# echo "deb [arch=amd64] https://nemesida-security.com/repo/nw/ubuntu jammy non-free" > /etc/apt/sources.list.d/NemesidaWAF.list
# curl -s https://nemesida-security.com/repo/nw/gpg.key | gpg --no-default-keyring --keyring gnupg-ring:/etc/apt/trusted.gpg.d/trusted.gpg --import
# chmod 644 /etc/apt/trusted.gpg.d/trusted.gpg 
# apt update && apt upgrade
CentOS 7
# rpm -Uvh https://nemesida-security.com/repo/nw/centos/nwaf-release-centos-7-1-6.noarch.rpm
# yum update
# yum install epel-release
CentOS 8 Stream
# rpm -Uvh https://nemesida-security.com/repo/nw/centos/nwaf-release-centos-8-1-6.noarch.rpm
# dnf update
# dnf install epel-release
CentOS 9 Stream
# rpm -Uvh https://nemesida-security.com/repo/nw/centos/nwaf-release-centos-9-1-6.noarch.rpm
# dnf update
# dnf install epel-release
Information about using Nemesida AI MLC in a Docker container is available in the corresponding section.
Information about using Nemesida WAF as a Virtual Appliance (a virtual disk for KVM/VMware/VirtualBox) and as a Yandex VM is available in the corresponding section.

The Nemesida AI module consists of the Nemesida AI MLA module (included in the installation package of the Nemesida WAF module) and the Nemesida AI MLC module. They can interact in normal mode (both modules operate on the same server) or in point-to-multipoint mode (the Nemesida AI MLC module operates on a dedicated server).

Python pip packages
For the machine learning modules to work correctly, the same versions of the Python3 pip packages must be used on all servers with Nemesida AI MLA and Nemesida AI MLC installed.

Installation

Debian 9
# apt install python3 python3-venv python3-pip python3-dev python3-setuptools libc6-dev rabbitmq-server gcc memcached
Debian 10
# apt install python3 python3-venv python3-pip python3-dev python3-setuptools libc6-dev rabbitmq-server gcc memcached
Debian 11
# apt install python3 python3-venv python3-pip python3-dev python3-setuptools libc6-dev rabbitmq-server gcc memcached

Install Nemesida AI MLC:

# apt install nwaf-mlc
Ubuntu 18.04
# apt install python3 python3-venv python3-pip python3-dev python3-setuptools libc6-dev rabbitmq-server gcc memcached
Ubuntu 20.04
# apt install python3 python3-venv python3-pip python3-dev python3-setuptools libc6-dev rabbitmq-server gcc memcached
Ubuntu 22.04
# apt install python3 python3-venv python3-pip python3-dev python3-setuptools libc6-dev rabbitmq-server gcc memcached

Install Nemesida AI MLC:

# apt install nwaf-mlc
CentOS 7
# yum install gcc rabbitmq-server python36 python36-devel python36-setuptools python36-pip memcached
# yum install nwaf-mlc
CentOS 8 Stream
Add the RabbitMQ repository by editing the file /etc/yum.repos.d/RabbitMQ.repo so that it reads as follows:

[rabbitmq_erlang]
name = rabbitmq_erlang
baseurl = https://packagecloud.io/rabbitmq/erlang/el/8/$basearch
repo_gpgcheck = 0
gpgcheck = 0
enabled = 1

[rabbitmq_server]
name = rabbitmq_server
baseurl = https://packagecloud.io/rabbitmq/rabbitmq-server/el/8/$basearch
repo_gpgcheck = 0
gpgcheck = 0
enabled = 1

Install the package:

# dnf update
# dnf install rabbitmq-server

Enable the service and check that it runs correctly:

# systemctl enable rabbitmq-server
# service rabbitmq-server restart
# service rabbitmq-server status

Install Nemesida AI MLC:

# dnf install gcc python39 python39-devel python39-setuptools python39-pip memcached
# dnf install nwaf-mlc
CentOS 9 Stream
Install the packages:

# dnf install dnf-utils
# dnf install centos-release-rabbitmq-38
# dnf install rabbitmq-server

Enable the service and check that it runs correctly:

# systemctl enable rabbitmq-server
# service rabbitmq-server restart
# service rabbitmq-server status

Install Nemesida AI MLC:

# dnf install gcc python3 python3-devel python3-setuptools python3-pip memcached
# dnf install nwaf-mlc

Initial setup

After installing the module, perform the initial configuration by specifying the following parameters:

mlc.conf parameters
Default parameter
Description of the parameter

[Nemesida AI MLC]
The section responsible for the general settings of the Nemesida AI MLC module. To configure the module, make the necessary changes to the main configuration file /opt/mlc/mlc.conf
nwaf_license_key
Sets the Nemesida WAF license key when the module runs on a dedicated server. The parameter is not needed when the module runs on the same server as the Nemesida WAF module or in Multipoint Mode.

Usage example:

nwaf_license_key = 1234567890
api_uri
The Nemesida WAF API address for sending information about the training status of models and information about detected anomalies. If the parameter value is empty, no information will be sent.
sys_proxy
Configuring the proxy server address to access nw-auth-extra.nemesida-security.com:443 (license key verification) and nemesida-security.com:443 (loading a list of virtual hosts, loading and unloading behavioral models).

Example:

sys_proxy=http://proxy.example.com:3128
api_proxy
Configuring the proxy server address to access the Nemesida WAF API and Nemesida WAF Signtest.

Example:

api_proxy=http://proxy.example.com:3128

If the parameters have no values, the module will try to use the parameters from the file nwaf.conf.

st_enable
Sending disputed requests received from the Nemesida WAF module using RabbitMQ to the Nemesida WAF Signtest server for subsequent processing.

Disputed requests are defined as follows:
– if the signature analysis has determined the request as illegitimate, and the Nemesida AI MLC module has determined it as legitimate;
– if the signature analysis determined the request as legitimate, and the Nemesida AI MLC module determined it as illegitimate.

st_uri
The address of the Nemesida WAF Signtest server for sending disputed requests.

If the Nemesida WAF API and Nemesida WAF Signtest modules are not configured yet, then the parameters api_uri, api_proxy, st_enable, st_uri can be specified later.

After making changes, restart the server or restart the service and check its operation:

# service mlc_main restart
# service mlc_main status

Configuring

The cloud settings management functionality is enabled by default, but it can be disabled by sending a request by email.

In order to avoid errors when configuring the module, we recommend using a cloud WebApp.

Manage settings using a cloud WebApp and cloud API

Information about configuring Nemesida AI MLC using the cloud WebApp and cloud API is available in the relevant sections.

Manage settings using configuration files

Nemesida AI MLC

To configure the module, make the necessary changes to the configuration file /opt/mlc/mlc.conf. The /opt/mlc/mlc-example.conf file contains a complete list of available parameters.

mlc.conf parameters
Default parameter
Description of the parameter

[main]
The section responsible for the general settings of the Nemesida AI MLC module.
vhosts_list

A list of domain names used as virtual hosts for which behavioral models need to be created.

Wildcard values can be used. Only one model is applied to a request. Models are applied in the following order of priority, from highest to lowest:

1. example.com – building and applying a model for a specific domain;
2. .example.com – building and applying a model for the domain example.com and its subdomains;
3. *.example.com – building and applying a model only for subdomains of example.com, excluding the main domain example.com;
4. * – building and applying a model for all other domains.

Usage example:

vhosts_list = .example.com b.example.com

– building and applying models for the listed virtual hosts

Simultaneous use of example.com and .example.com is not allowed. If you need to use one model for a domain and another for its subdomains, use example.com and *.example.com.

Example:

vhosts_list = example.com *.example.com

When behavioral models are applied to a request, the domain level is taken into account: the model that corresponds to the virtual host from the request is applied, and if there is no such model, the model that covers this domain is applied.

Thus, for a request with the domain name b.example.com, the model b.example.com is applied; if there is no such model, the model .example.com is applied; if that one is also missing, the model *.example.com is applied.
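
The fallback order described above can be sketched in Python. This is only an illustration of the documented priority (exact name, then .domain, then *.domain, then *). The function name select_model is hypothetical, and the handling of deeper subdomain levels (trying the most specific parent domain first) is an assumption, not the module's actual implementation.

```python
def select_model(host, models):
    """Return the vhosts_list entry whose model would apply to `host`.

    `models` is the set of entries from vhosts_list. Returns None if
    no entry matches. Hypothetical helper for illustration only.
    """
    # 1. Exact match for the requested virtual host.
    if host in models:
        return host

    # Candidate domains: the host itself, then each parent domain.
    labels = host.split(".")
    suffixes = [".".join(labels[i:]) for i in range(len(labels))]

    for i, domain in enumerate(suffixes):
        # 2. ".example.com" covers example.com and its subdomains.
        if "." + domain in models:
            return "." + domain
        # 3. "*.example.com" covers subdomains only, so it is skipped
        #    when `domain` is the requested host itself (i == 0).
        if i > 0 and "*." + domain in models:
            return "*." + domain

    # 4. Catch-all entry for all other domains.
    return "*" if "*" in models else None
```

For example, with the entries .example.com and * configured, a request for b.example.com would select .example.com, while a request for example.com with only *.example.com and * configured would fall through to *.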

The training period of the model can be changed by specifying the required number of days before the domain name: x:example.com, where x is the training period in days.

Usage example:
5:example.com — the training of the model will last 5 days.

nwaf_license_key
Sets the Nemesida WAF license key when the module runs on a dedicated server. The parameter is not needed when the module runs on the same server as the Nemesida WAF module or in Multipoint Mode.

Usage example:

nwaf_license_key = 1234567890
ai_extra

Activation/deactivation of the additional request analysis functionality, which allows detecting missed attacks and temporarily blocking their source by IP address. If the additional analysis functionality is inactive, all unblocked requests will be included in the training sample (with the exception of requests that fall under the WL mode, or illegitimate requests that fall under the LM mode).

api_uri
The Nemesida WAF API address for sending information about the training status of models and information about detected anomalies. If the parameter value is empty, no information will be sent.

[run]
The section responsible for connection parameters with the local RabbitMQ service.
rmq_host
Connection parameters with the RabbitMQ service.

It is allowed to use multiple values separated by a space.

Example:

rmq_host = guest:guest@192.168.0.1 guest:guest@192.168.0.2

It is allowed to use a secure connection:

rmq_host = ssl://guest:guest@example.ru:5673

To use a non-standard port, specify it explicitly; otherwise the standard port 5672 is used.

Before using a secure connection, it must be configured on each server with the Nemesida WAF dynamic module installed.
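
To make the rmq_host value format concrete, here is a minimal Python sketch that splits such a value into connection endpoints. It assumes only what is stated above: values are separated by spaces, the ssl:// prefix marks a secure connection, and port 5672 is used when no port is given. The function name parse_rmq_host and the returned dictionary layout are hypothetical and do not reflect how the module parses the parameter internally.

```python
from urllib.parse import urlsplit

DEFAULT_RMQ_PORT = 5672  # standard port named in this guide


def parse_rmq_host(value):
    """Split an rmq_host value into a list of endpoint dictionaries."""
    endpoints = []
    for item in value.split():
        secure = item.startswith("ssl://")
        # urlsplit needs a scheme to recognise user:password@host:port,
        # so a placeholder scheme is added for plain entries.
        url = item if "://" in item else "plain://" + item
        parts = urlsplit(url)
        endpoints.append({
            "user": parts.username,
            "password": parts.password,
            "host": parts.hostname,
            "port": parts.port or DEFAULT_RMQ_PORT,
            "ssl": secure,
        })
    return endpoints
```

Calling parse_rmq_host("ssl://guest:guest@example.ru:5673") yields a single endpoint with port 5673 and ssl set to True, while entries without a port fall back to 5672.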

rmq_host_local
Connection parameters with the RabbitMQ service for local queue placement.

Example:

rmq_host_local = guest:guest@127.0.0.1

If the parameter is omitted, the value guest:guest@127.0.0.1 is used.


[proxy]
The section responsible for configuring the connection to the proxy server.
sys_proxy

Configuring the proxy server address to access nw-auth-extra.nemesida-security.com:443 (license key verification) and nemesida-security.com:443 (loading a list of virtual hosts, loading and unloading behavioral models).

Example:

sys_proxy=http://proxy.example.com:3128
api_proxy
Configuring the proxy server address to access the Nemesida WAF API and Nemesida WAF Signtest.

Example:

api_proxy=http://proxy.example.com:3128

If the parameters have no values, the module will try to use the parameters from the file nwaf.conf.


[ddos]
The section responsible for the operation of the functionality for detecting denial of service attacks (DDoS attacks) at the application level.
enable
Activation/deactivation of the functionality.
wl_ip
A parameter that defines the path to a file containing the IP addresses, in the format 1.2.3.4, for which the functionality will be disabled. Values are separated by spaces.

Changes in files are applied automatically, without restarting Nemesida AI MLC.

wl_url
A parameter that defines the path to the file in which addresses are specified both in the format vhost and in the format vhost/path, where:

vhost – the name of the virtual host for which the DDoS detection functionality will be disabled.
path – the occurrence of the resource address.

Strict matching and wildcard values are allowed for a virtual host.

Example:

example.com/feed
.example.com/feed
*.example.com/feed
*/feed

Changes in files are applied automatically, without restarting Nemesida AI MLC.

interval
The time interval of the segment (window) during which the query analysis is performed.
latest_only
Activation of transmission to the Nemesida WAF API of only the last blocked request for each IP address. If set to false, all blocked requests for each IP address are transmitted to the Nemesida WAF API.
send_possible
Activation of the transmission to the Nemesida WAF API of requests with the type Possible DDoS. If set to false, such requests are not transmitted to the Nemesida WAF API.

The prefix Possible is added to the name of the attack if its type has not been reliably established.


[brute]
The section responsible for detecting flood attacks and brute force attacks. Enumeration of values is detected in the ARGS and/or BODY zones.
enable
Activation/deactivation of the functionality.
wl_host
Deactivation of the functionality for specific virtual hosts. Strict matching and wildcard values are allowed: example.com, .example.com, *.example.com.

Example:

wl_host = example.com .example.org *.example.us

Parameter changes are applied automatically, without restarting Nemesida AI MLC.

interval
The time interval of the segment (window) during which the query analysis is performed.
max_val
The number of requests at which the source(s) of the attack are blocked.
brute_detect
A parameter that defines the path to a file in which the system user lists addresses for brute force attack detection in the format vhost/path, where path is the occurrence of the resource address on the web server. Strict matching and wildcard values are allowed: example.com, .example.com, *.example.com.

Example:

example.com/auth
.example.com/auth
*.example.com/auth
*/auth

Thus, with the value example.com/auth set, brute force attacks will be monitored both for example.com/auth and for example.com/auth/reset_password.

The parameter is used to detect brute force attacks, but does not block duplicate requests with the same content in the ARGS or BODY zones.

For addresses not specified in the file, brute force attacks are not detected. Changes in files are applied automatically, without restarting Nemesida AI MLC.
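
The vhost/path matching described for brute_detect can be illustrated with a short Python sketch. It assumes that the path part is compared as a prefix of the request path, which is consistent with the example.com/auth example above but is not confirmed by this guide (the text only speaks of an "occurrence" of the resource address). The function names vhost_matches and rule_matches are hypothetical.

```python
def vhost_matches(pattern, host):
    """Check a virtual host against a strict or wildcard pattern."""
    if pattern == "*":
        return True                       # any virtual host
    if pattern.startswith("*."):
        # Subdomains only: "a.example.com" matches, "example.com" does not.
        return host.endswith(pattern[1:])
    if pattern.startswith("."):
        # The domain itself and its subdomains.
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern                # strict match


def rule_matches(rule, host, path):
    """Check a request (host, path) against one vhost/path rule.

    Assumption: the rule path is treated as a prefix of the request path.
    """
    pattern, _, rule_path = rule.partition("/")
    return vhost_matches(pattern, host) and path.startswith("/" + rule_path)
```

Under these assumptions, the rule example.com/auth matches requests to example.com/auth/reset_password, and */auth matches /auth on any virtual host.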

flood_detect
The parameter has a similar functionality to the brute_detect parameter, but is designed to detect flood attempts or similar attacks with repeated requests. The only difference is that during the analysis of requests that fall under the action of the flood_detect parameter, duplicates are not deleted.

Thus, when identical requests are sent repeatedly (for example, multiple attempts to recover a password by SMS), requests with similar content that fall under the flood_detect parameter are not deleted, unlike requests with similar content that fall under the brute_detect parameter.

For addresses not specified in the file, flood attempts are not detected. Changes in files are applied automatically, without restarting Nemesida AI MLC.
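
The difference between brute_detect and flood_detect comes down to whether duplicate requests are counted. The Python sketch below illustrates that idea only. It assumes that requests from one source within one interval are compared by their ARGS/BODY content and that blocking is triggered once the count reaches max_val. The function name exceeds_threshold is hypothetical, and the module's real analysis is likely more involved.

```python
def exceeds_threshold(request_bodies, max_val, deduplicate):
    """Decide whether one source reached max_val within one interval.

    request_bodies: ARGS/BODY contents of the requests from that source.
    deduplicate=True models brute_detect (duplicates are removed);
    deduplicate=False models flood_detect (duplicates are kept).
    """
    items = set(request_bodies) if deduplicate else list(request_bodies)
    return len(items) >= max_val


# Ten identical password-recovery requests and ten different passwords.
same = ["code=1234"] * 10
varied = ["password=%d" % i for i in range(10)]
```

With max_val set to 5, the ten identical requests reach the threshold only when duplicates are kept (the flood case), whereas the ten different password attempts reach it even after deduplication (the brute force case).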

latest_only
Activation of transmission to the Nemesida WAF API of only the last blocked request for each IP address. If set to false, all blocked requests for each IP address are transmitted to the Nemesida WAF API.
send_possible
Activation of the transmission to the Nemesida WAF API of requests with the type Possible Brute force/Possible Flood. If set to false, such requests are not transmitted to the Nemesida WAF API.

The prefix Possible is added to the name of the attack if its type has not been reliably established.


[st]
The section responsible for interacting with the Nemesida WAF Signtest learning management module.
st_enable
Sending disputed requests received from the Nemesida WAF module using RabbitMQ to the Nemesida WAF Signtest server for subsequent processing.

Disputed requests are defined as follows:
– if the signature analysis has determined the request as illegitimate, and the Nemesida AI MLC module has determined it as legitimate;
– if the signature analysis determined the request as legitimate, and the Nemesida AI MLC module determined it as illegitimate.
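
Put differently, a request is disputed whenever the two verdicts disagree. A one-line Python sketch of that rule, with a hypothetical function name and boolean inputs chosen purely for illustration:

```python
def is_disputed(signature_says_illegitimate, mlc_says_illegitimate):
    """A request is disputed when signature analysis and MLC disagree."""
    return signature_says_illegitimate != mlc_says_illegitimate
```

Requests on which both analyses agree, whether legitimate or illegitimate, are not disputed under this rule.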

st_uri
The address of the Nemesida WAF Signtest server for sending disputed requests.

[mls]
The section responsible for transmitting traffic to a remote server for the construction of behavioral models. To use this functionality, contact technical support.
mls_enable
Activation of the mechanism for transmitting the analyzed traffic to the Nemesida WAF MLS server. By default, the functionality is deactivated.

[training]
Learning process management section.
dataset_limit
Sets the maximum number of unique queries included in the training sample.

After making changes, restart the server or restart the service and check its operation:

# service mlc_main restart
# service mlc_main status

Additional modes of operation of the Nemesida AI MLC module

Working in Multipoint Mode

To build behavioral models, the Nemesida AI MLC module requires a significant amount of free RAM. When using more than one server with the Nemesida WAF module, you can save hardware resources by using the point-to-multipoint operation scheme (one server with the Nemesida AI MLC module installed interacts with many servers with Nemesida WAF modules installed).

On a server with the Nemesida WAF module installed

– Create a user of the RabbitMQ service:

# rabbitmqctl add_user USER PASSWORD
# rabbitmqctl set_permissions -p / USER ".*" ".*" ".*"

where USER and PASSWORD are the username and password for connecting the Nemesida AI MLC module.

– Make changes to the configuration file /etc/rabbitmq/rabbitmq-env.conf:

NODE_PORT=5672
export RABBITMQ_NODENAME=rabbit@localhost
export RABBITMQ_NODE_IP_ADDRESS=0.0.0.0
export ERL_EPMD_ADDRESS=127.0.0.1

– Allow access from the server on which the Nemesida AI MLC module is installed to the RabbitMQ port (by default 5672 TCP).
– Complete the RabbitMQ setup:

# service rabbitmq-server restart

On a server with the Nemesida AI MLC module installed

Create additional configuration files in the /opt/mlc/conf/ directory by copying the /opt/mlc/mlc.conf file. Make changes to the new configuration files to work with the remote RabbitMQ server. After making the changes, restart the service:

# service mlc_main restart
# service mlc_main status

In additional configuration files nwaf_license_key is a required parameter. The license key used in the Nemesida AI MLC settings and the remote Nemesida WAFs must have the same WAF ID. When using additional configuration files, it is recommended to delete the /opt/mlc/mlc.conf file.

Using remote RabbitMQ services, the Nemesida AI MLC module will collect queries and then train models in the same way as in normal operation.

Working with the Nemesida AI MLS cloud server

The Nemesida AI cloud server is designed to generate behavioral models based on a copy of the traffic coming from remote servers. The cloud server is used when the Nemesida WAF software user does not have enough RAM for the Nemesida AI MLC module to work. To use the capabilities of the Nemesida AI cloud server, contact technical support.

Behavioral model management
Behavioral models can also be managed using the cloud web application and cloud API.

Accuracy of behavioral models

During the training period, in order to build better models, it is not recommended to scan the web application for vulnerabilities or to send other illegitimate requests. Immediately after the first training, it is recommended to retrain the models. False positives are managed using the Nemesida WAF Signtest module.

Storage of behavioral models

Behavioral models created by the Nemesida AI MLC module are transmitted to the remote Nemesida AI MLS server and automatically distributed to all running instances of Nemesida AI MLA and Nemesida AI MLC in accordance with the WAF ID.

Retraining of Nemesida AI models

To improve the accuracy of attack detection, it is recommended to retrain models once a week. To do this, use the cloud WebApp or the cloud API, or, when using configuration files, append the ^ symbol to the virtual host value.

Example:

vhosts_list = example.com^

After making the changes, restart the service:

# service mlc_main restart

After retraining the models, it is recommended to delete the exported BT 12 requests (the requests are contained on the False Positive page available on the web interface of the module Nemesida WAF Signtest). When training models for a virtual host, BT 12 requests will be included in the training sample and will not be required further.

To retrain behavioral models when using cloud services, set the list of virtual hosts through those services.

Increasing the learning time of Nemesida AI behavioral models

The correct construction of models requires about 400,000-800,000 unique requests. By default, the training period is 4 days. The training period of a model can be changed in the list of domain names in the vhosts_list parameter in mlc.conf, using the form x:example.com, where x is the training period in days. For example, 5:example.com means that the training of the model will last 5 days.

After making the changes, restart the service:

# service mlc_main restart
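
As an illustration of the vhosts_list entry syntax, the following Python sketch splits an entry into the virtual host, the training period and the retraining marker. It relies only on what this guide states: an optional x: prefix sets the period in days, the default period is 4 days, and a trailing ^ requests retraining. The function names are hypothetical, and whether the prefix and the ^ marker may be combined in one entry is not stated here, so that case is left untested.

```python
DEFAULT_TRAINING_DAYS = 4  # default training period stated in this guide


def parse_vhost_entry(entry):
    """Split one vhosts_list entry into (vhost, training_days, retrain)."""
    retrain = entry.endswith("^")        # trailing ^ requests retraining
    if retrain:
        entry = entry[:-1]
    days = DEFAULT_TRAINING_DAYS
    prefix, sep, rest = entry.partition(":")
    if sep and prefix.isdigit():         # "5:example.com" -> 5 days
        days = int(prefix)
        entry = rest
    return entry, days, retrain


def parse_vhosts_list(value):
    """Parse a whole space-separated vhosts_list value."""
    return [parse_vhost_entry(item) for item in value.split()]
```

For instance, parse_vhost_entry("5:example.com") gives the host example.com with a 5-day period, and an entry without a prefix keeps the 4-day default.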

Additional training of models using a backup copy of the training sample

The correct construction of models requires about 400,000-800,000 unique requests. If the number of requests was insufficient during the training, then you can restart it and use the requests from the previous sample. To do this, follow these steps:

1. Stop the Nemesida AI MLC service:

# service mlc_main stop

2. Move the file /opt/mlc/ml/backup/[vhost].d_[timestamp], where [timestamp] is the creation date of the backup copy of the training sample that Nemesida AI MLC makes before it starts building the model, to /opt/mlc/ml/[vhost].d. For example, for the model example.com:

# mv /opt/mlc/ml/backup/example.com.d_1613587613 /opt/mlc/ml/example.com.d

3. Start the training. To do this, add the symbol ^ to the value of the virtual host.

Example:

vhosts_list = example.com^

Launch the Nemesida AI MLC service:

# service mlc_main start

After the end of the training period (the period can be changed), a behavioral model will be created based on queries from the general sample.
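
When several backup copies exist, it can help to check when each one was created before moving it in step 2. The Python sketch below assumes that the [timestamp] suffix is a Unix epoch value in seconds, which the example name example.com.d_1613587613 suggests but this guide does not state explicitly. Both function names are hypothetical.

```python
from datetime import datetime, timezone


def backup_created_at(filename):
    """Extract the creation time from a '[vhost].d_[timestamp]' name.

    Assumption: the timestamp is Unix epoch seconds.
    """
    _, _, ts = filename.rpartition(".d_")
    return datetime.fromtimestamp(int(ts), tz=timezone.utc)


def latest_backup(filenames, vhost):
    """Return the most recent backup file name for `vhost`, or None."""
    prefix = vhost + ".d_"
    candidates = [f for f in filenames if f.startswith(prefix)]
    return max(candidates, key=lambda f: int(f[len(prefix):]), default=None)
```

Under that assumption, the backup example.com.d_1613587613 was created on 2021-02-17 at 18:46:53 UTC.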

Removing Nemesida AI models

If behavioral models were trained incorrectly, or if significant changes in the web application lead to many false positives, it is recommended to delete the models. To do this, send a request through the cloud API or use the settings management in the cloud WebApp.

Example of deleting a model using the cloud API:

# curl 'https://nemesida-security.com/nw/ml/mgmt/del_models_uri' --data 'key=1234567890&vhost=example.com'

where:

  • key – license key or its SHA256 hash;
  • vhost – the name of the virtual host for which you want to delete the model.
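
Since the key parameter accepts either the license key or its SHA256 hash, the request body can be prepared without exposing the key itself. The Python sketch below only builds the body and does not send anything. It assumes that the hash is the hexadecimal SHA256 digest of the key string encoded as UTF-8, which this guide does not specify, and the function name build_delete_payload is hypothetical.

```python
import hashlib
from urllib.parse import urlencode


def build_delete_payload(license_key, vhost, hash_key=True):
    """Build the form body for a model deletion request.

    Assumption: the accepted hash is the hex SHA256 digest of the
    UTF-8 encoded license key.
    """
    if hash_key:
        key = hashlib.sha256(license_key.encode("utf-8")).hexdigest()
    else:
        key = license_key
    return urlencode({"key": key, "vhost": vhost})
```

With hash_key=False the result for the example values is key=1234567890&vhost=example.com, the same body as in the curl command above; the string can be passed to curl via --data.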

Training sample for building behavioral models

Queries defined as BT 1, BT 2, BT 3 and BT 4 are not added to the training sample, even if they fall under the LM mode.

When the ai_extra mode is enabled, queries defined by the Nemesida AI MLC module as illegitimate are not added to the training sample.