Abstract
Optimization is the process of finding the best solution to a problem within specified constraints and objectives. As the world evolves, it brings new optimization problems with it. Metaheuristic algorithms aim to find the optimum of an objective function over a given value range by means of heuristic approaches. No single optimization algorithm can be expected to succeed on every problem or test function; therefore, the heuristic algorithm to be used should be chosen according to the problem at hand. In this study, the Simulated Annealing (SA) and Particle Swarm Optimization (PSO) algorithms are examined. Experiments were conducted on the Ackley, Beale, Goldstein-Price, and Levi test functions using the Python programming language.
1. Introduction
Optimization is the process of finding the optimal solution point in the solution space of a problem under defined constraints. Nowadays, both mathematical and heuristic techniques are employed to solve optimization problems. Because mathematical techniques scan the entire solution space, they become costly when the solution space is vast. In such cases, it is more advantageous to use heuristic algorithms that navigate the solution space intuitively and reach solutions in a shorter time (Çelik, 2013). Metaheuristic algorithms are solution methods obtained by adapting basic heuristic techniques to specific problems. The term "metaheuristic" combines the Greek words "meta" (alongside, beyond) and "heuristic" (to find). These algorithms navigate the solution space efficiently, using high-level strategies and effective search operations to reach the optimal solution faster (Çelik, 2013). One of the most significant advantages of metaheuristic optimization algorithms is their ability to reach a global solution without getting trapped in local optima (Laporte, 2006). Metaheuristic algorithms can be applied to various integrated problems and possess flexible structures (Çelik, 2013). In other words, a metaheuristic algorithm can be seen as a general algorithmic template that can be adapted to a specific problem with a few modifications, while also remaining applicable to other optimization problems. The classification of metaheuristic algorithms is shown in Figure 1.1.
2. Metaheuristic Algorithms
2.1. Simulated Annealing
The foundation of the simulated annealing analogy is the work of Metropolis et al. (1953), in which an algorithm mimicking the thermal equilibrium of a solid at a given temperature was developed. Subsequently, Kirkpatrick et al. (1983) proposed an algorithm based on that work which could be applied to combinatorial optimization problems (Akbulut, 2020). Simulated annealing (SA) is a stochastic approach that simulates the statistical process of crystal growth during annealing in order to reach the global minimum internal energy configuration. In simpler terms, it mimics the physical annealing process that improves the quality of a solid material by first heating it to a certain temperature and then cooling it. In this process, a solid such as a metal is heated to a high temperature and melted so that its atoms can move freely. As the temperature is lowered, the movement of the atoms becomes restricted; the atoms tend to move less, and the crystal structure settles into the lowest possible internal energy. Crystal formation depends primarily on the cooling rate. If molten metal is cooled too quickly, it may not reach a crystalline state; instead, it may end up in a polycrystalline state with higher energy than the crystalline one. In engineering applications, such rapid cooling leaves defects inside the material. Therefore, to reach the lowest internal energy state, the heated solid (molten metal) must be cooled slowly and in a controlled manner. This slow cooling process is called annealing (Koç, 2019).
In simulated annealing, controlling the cooling process, that is, the rate at which the temperature decreases, corresponds to controlling the probability of accepting solutions that are worse than the current one but may still lead toward the overall best solution. At the beginning of the search, this probability should be set high enough that inferior solutions are accepted, so the search can escape local optima in favor of the overall best solution. As the search progresses, the probability decreases, making it harder to escape from local optima. The aim is to reach the overall best solution rather than merely moving to a new local optimum by searching the neighbors of the current one (Koç, 2019).
The probability of selecting a poor solution systematically decreases with temperature. The main objective of the SA algorithm is to leave no unexplored regions in the solution space. Simulated Annealing can be considered as a useful method providing near-optimal solutions for combinatorial optimization problems. SA has been applied to solve many combinatorial optimization problems such as the traveling salesman problem, scheduling, quadratic assignment problem, and network design (Sarıkaya, 2014).
To apply the SA algorithm to a problem, several parameters must be determined: the initial temperature (T0), the number of iterations at each temperature, the cooling function, and the stopping criterion. The initial temperature is an input parameter used to control the acceptance probability of poor solutions. The number of iterations is the number of solutions generated at each temperature. The cooling function determines the temperature at the current step from the temperature of the previous step. Together, the initial temperature, the number of iterations, and the cooling function form the cooling schedule, which strongly influences both solution quality and convergence rate. The SA algorithm stops when the solution obtained does not change significantly over several consecutive temperature changes (Sarıkaya, 2014).
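For illustration, a minimal Python sketch of a geometric cooling schedule is given below. The function name and structure are illustrative assumptions rather than the exact implementation used in this study; the initial temperature of 700 and the cooling factor of 0.88 match the values reported later in Table 2.2.

```python
# A minimal sketch of a geometric cooling schedule (illustrative, not the
# exact implementation used in this study).
def geometric_cooling(initial_temperature, fraction, number_of_steps):
    """Yield the temperature for each cooling step: T_{k+1} = fraction * T_k."""
    temperature = initial_temperature
    for _ in range(number_of_steps):
        yield temperature
        temperature *= fraction

# Example: start at T0 = 700 and cool by a factor of 0.88, as in Table 2.2.
for t in geometric_cooling(700, 0.88, 5):
    print(round(t, 2))  # 700, 616.0, 542.08, 477.03, 419.79
```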
Güner and Altıparmak (2003) have expressed the relationship between physical annealing and combinatorial optimization as shown in Table 2.1 (Akbulut, 2020).
| Thermodynamic Simulation | Combinatorial Optimization |
| --- | --- |
| System States | Suitable Solutions |
| Energy | Objective Function |
| State Change | Neighbor Solution |
| Temperature | Control Parameter |
| Freezing State | Heuristic Solution |
2.1.1 Methodology
The SA algorithm starts with a randomly generated initial solution and, at each iteration, generates the next solution from the local neighborhood of the current one. A new solution that improves the objective function representing the system's energy level (i.e., reduces the energy of the material/system) is always accepted. In addition, candidate solutions that worsen the objective function, corresponding to an increase in the system's energy, are also accepted to a certain extent (Akbulut, 2020). The basic operation of the algorithm is represented in Figure 2.1.
The energy difference (∆E) between the current solution and the new solution generated from its random neighborhood is the difference between their objective function values. Whether the new solution is accepted is determined by the Boltzmann distribution, as shown in Equation 2.1.
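Since Equation 2.1 is only referenced and not reproduced in the text, the standard Boltzmann acceptance probability it corresponds to is written out below (textbook form; the notation in the original equation may differ):

$$P(\text{accept}) = \exp\!\left(-\frac{\Delta E}{T}\right), \qquad \Delta E = f(x_{\text{new}}) - f(x_{\text{current}})$$

where T is the current temperature.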
If a randomly generated number is smaller than the Boltzmann acceptance probability, the new solution is accepted; otherwise, it is rejected and the iteration continues with the old solution. The pseudo-code of the SA algorithm is given in Figure 2.2, and the flowchart in Figure 2.3.
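A minimal Python sketch of this accept/reject loop is shown below. It is an illustrative skeleton assuming a Gaussian neighborhood move with a fixed step size and geometric cooling; it is not the exact code used to produce the results in Section 2.1.2.

```python
import math
import random

def simulated_annealing(objective, start, bounds, initial_temperature=700,
                        fraction=0.88, iterations_per_temperature=50,
                        minimum_temperature=1e-3, step_size=0.1):
    """Illustrative SA skeleton: Gaussian neighbor moves + Boltzmann acceptance.

    objective: function mapping a list of coordinates to a float.
    bounds: list of (low, high) pairs, one per dimension.
    """
    current = list(start)
    current_value = objective(current)
    best, best_value = list(current), current_value
    temperature = initial_temperature
    while temperature > minimum_temperature:
        for _ in range(iterations_per_temperature):
            # Generate a random neighbor and clip it to the search range.
            candidate = [min(max(x + random.gauss(0.0, step_size), lo), hi)
                         for x, (lo, hi) in zip(current, bounds)]
            candidate_value = objective(candidate)
            delta_e = candidate_value - current_value
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if delta_e < 0 or random.random() < math.exp(-delta_e / temperature):
                current, current_value = candidate, candidate_value
                if current_value < best_value:
                    best, best_value = list(current), current_value
        temperature *= fraction  # geometric cooling
    return best, best_value
```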
2.1.2 Results
The SA algorithm, expressed in the flowchart in Figure 2.3, was implemented in the Python language, and tests were run on the Ackley, Beale, Goldstein-Price, and Levi test functions. The results obtained from the tests are presented below in order.
The parameters and values used for all tests are given in Table 2.2.
| Parameter | Value |
| --- | --- |
| startedLocation | [0.5, -0.5] |
| numberOfIterations | [50, 500] |
| temperatureValues | 700 |
| fraction | 0.88 |
| functionToOptimize | // Test function |
| functionRange | // Test function range |
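The benchmark functions themselves are not listed in the text; the sketch below gives their standard two-dimensional definitions, which are assumed to match the forms used in the experiments.

```python
import math

def ackley(p):
    """Ackley function; global minimum 0 at (0, 0)."""
    x, y = p
    return (-20 * math.exp(-0.2 * math.sqrt(0.5 * (x**2 + y**2)))
            - math.exp(0.5 * (math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y)))
            + math.e + 20)

def beale(p):
    """Beale function; global minimum 0 at (3, 0.5)."""
    x, y = p
    return ((1.5 - x + x * y)**2
            + (2.25 - x + x * y**2)**2
            + (2.625 - x + x * y**3)**2)

def goldstein_price(p):
    """Goldstein-Price function; global minimum 3 at (0, -1)."""
    x, y = p
    a = 1 + (x + y + 1)**2 * (19 - 14*x + 3*x**2 - 14*y + 6*x*y + 3*y**2)
    b = 30 + (2*x - 3*y)**2 * (18 - 32*x + 12*x**2 + 48*y - 36*x*y + 27*y**2)
    return a * b

def levi(p):
    """Levi N.13 function; global minimum 0 at (1, 1)."""
    x, y = p
    return (math.sin(3 * math.pi * x)**2
            + (x - 1)**2 * (1 + math.sin(3 * math.pi * y)**2)
            + (y - 1)**2 * (1 + math.sin(2 * math.pi * y)**2))
```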
The graph of values generated with the Ackley test function in SA is given in Figure 2.4.
The value range and expected best values for the Ackley test function are given in Table 2.3.
| Parameter | Value |
| --- | --- |
| Value range | X => [-5, 5], Y => [-5, 5] |
| Expected best solution values | X = 0, Y = 0 |
| Expected optimum value | 0 |
| Obtained solution value (X) | -0.00119562669897455 |
| Obtained solution value (Y) | -0.0008832548326439538 |
| Obtained optimum value | 0.004263276791252935 |
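As a quick sanity check, evaluating the standard Ackley definition sketched above at the reported solution point should reproduce a value close to the reported optimum, assuming the same function form was used:

```python
# Using the ackley() sketch defined earlier.
print(ackley([-0.00119562669897455, -0.0008832548326439538]))
# Expected to print a value close to the reported 0.004263...
```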
The Beale test function was applied using the SA algorithm. The graph of the values generated with the Beale test function in SA is given in Figure 2.5.
The value range and expected best values for the Beale test function are given in Table 2.4.
| Parameter | Value |
| --- | --- |
| Value range | X => [-4.5, 4.5], Y => [-4.5, 4.5] |
| Expected best solution values | X = 3, Y = 0.5 |
| Expected optimum value | 0 |
| Obtained solution value (X) | 3.00448000488404 |
| Obtained solution value (Y) | 0.5025383356664512 |
| Obtained optimum value | 5.0725312871377724e-05 |
The Goldstein-Price test function was applied using the SA algorithm. The graph of the values generated with the Goldstein-Price test function in SA is given in Figure 2.6.
The value range and expected best values for the Goldstein-Price test function are given in Table 2.5.
| Parameter | Value |
| --- | --- |
| Value range | X => [-2, 2], Y => [-2, 2] |
| Expected best solution values | X = 0, Y = -1 |
| Expected optimum value | 3 |
| Obtained solution value (X) | 0.004099428038386865 |
| Obtained solution value (Y) | -1.0001308106302074 |
| Obtained optimum value | 3.00436862126945 |
The Levi test function was applied using the SA algorithm. The graph of the values generated with the Levi test function in SA is given in Figure 2.7.
The value range and expected best values for the Levi test function are given in Table 2.6.
| Parameter | Value |
| --- | --- |
| Value range | X => [-10, 10], Y => [-10, 10] |
| Expected best solution values | X = 1, Y = 1 |
| Expected optimum value | 0 |
| Obtained solution value (X) | 1.008089079742875 |
| Obtained solution value (Y) | 0.9719072291301988 |
| Obtained optimum value | 0.006684399680206016 |
As a result, it has been observed that the implemented algorithm produces results that closely approximate the expected values.
2.2. Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a metaheuristic algorithm developed by observing the social behaviors of bird and fish flocks. It was proposed by Eberhart and Kennedy in 1995. PSO is also referred to as bird swarm optimization (Çetin, 2013).
In this algorithm, bird and fish flocks search a given area to find food or shelter, and PSO models the social behaviors of these flocks. The first behavior is the tendency of each particle in the flock to return to the best position in its own memory. The second is to follow the particle closest to the food within the flock. The last is the particle's own past velocity, which enables it to explore a wide area. These behaviors form the basis of PSO (Çetin, 2013).
2.2.1 Methodology
The PSO algorithm is a population-based metaheuristic. It starts with a population of random solutions and tries to reach the global optimum by updating that population at each iteration. Each bird in the swarm represents a solution, and the response it produces in the dimensions it moves through corresponds to its current position. The best position a particle has found so far is referred to as pbest. The best of the pbest values in the swarm is called gbest, and this value is the optimum value sought.
For example, the velocities and positions of S particles moving in a D-dimensional search space can be expressed as follows. Here, the X position matrix is given by Equation 2.2, and the V velocity matrix is given by Equation 2.3.
In Equations 2.2 and 2.3, the position of the i-th particle is expressed as the row vector X_i, and its velocity as the row vector V_i. The best position of each particle (local best, pbest) is given by the pbest matrix, as shown in Equation 2.4.
The pbest matrix holds the best positions found by the S particles in D dimensions up to the current iteration. The best position found by the i-th particle is expressed as the row vector pbest_i.
The global best position of the swarm, gbest, is the best position in the pbest matrix. Equation 2.5 gives the gbest vector.
PSO conceptually relies on determining the velocities of particles in each generation based on their own local best positions and the global best position of the swarm. During the evolutionary process, the velocity and position of each particle are updated using Equations 2.6 and 2.7 (Gözde et al., 2008).
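Equations 2.6 and 2.7 are referenced but not reproduced in the text; their standard textbook forms are written out below (the notation in the original equations may differ):

$$v_{id}^{t+1} = W\, v_{id}^{t} + c_1 r_1 \left( pbest_{id} - x_{id}^{t} \right) + c_2 r_2 \left( gbest_{d} - x_{id}^{t} \right)$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}$$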
In Equation 2.6, c1 and c2 are positive constants that generally vary in the range [0.2, 2] depending on the problem. They can be given constant values, or, in some problems, assigned dynamically, since they weight the particle's memory terms. The numbers r1 and r2 are randomly generated coefficients that introduce randomness into the response produced at each iteration; they are generally drawn from the range [0, 1). W is the inertia weight and usually varies between 0.1 and 1 (Çetin, 2013). In PSO, the inertia weight W is used to balance global and local search capabilities: a large inertia weight facilitates global search, while a small inertia weight facilitates local search. The inertia weight thus balances local and global exploration and aims to reach the result in the fewest iterations. Each particle benefits not only from the best particle in the flock but also from the experiences of all other particles in the flock (Tamer & Karakuzu, 2006).
The linear decrease of W is given by Equation 2.8 (Kennedy & Eberhart, 1995).
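Equation 2.8 is likewise not reproduced in the text; the commonly used linearly decreasing inertia weight has the form

$$W^{t} = W_{\max} - \frac{W_{\max} - W_{\min}}{t_{\max}}\, t$$

where t is the current iteration and t_max is the total number of iterations.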
PSO particles continuously change their positions throughout the iteration period in the multi-dimensional search space. Changes in the search space are given in Figure 2.8.
The parameters on Figure 2.8 are as follows:
- x_i(t): current location
- x_i(t+1): new location
- v_i(t): current velocity
- v_i(t+1): new velocity
- v_pbest: velocity component calculated with pbest
- v_gbest: velocity component calculated with gbest
- pbest: local best location
- gbest: global best location
The pseudo-code for the PSO algorithm is given in Figure 2.9.
The flowchart for the PSO algorithm is given in Figure 2.10.
2.2.2 Results
The PSO algorithm, as expressed in the flowchart in Figure 2.10, was implemented in the Python language. Test functions including Ackley, Beale, Goldstein-Price, and Levi were executed. The results obtained from the tests are presented below.
The parameters and values used for all tests are provided in Table 2.7.
| Parameter | Value |
| --- | --- |
| functionToOptimize | // Test function |
| lowerBoundary | // Lower boundary |
| upperBoundary | // Upper boundary |
| particleSize | 100 |
| c1 | 0.5 |
| c2 | 0.5 |
| numberOfIterations | 50 |
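A minimal Python sketch of a PSO run under the Table 2.7 settings is given below. It follows the update rules described in Section 2.2.1, but it is an illustrative skeleton rather than the exact implementation behind the reported results; in particular, the inertia weight w = 0.7 is an assumed value (Table 2.7 does not list one), and ackley refers to the test function sketch given earlier.

```python
import random

def pso(objective, lower_boundary, upper_boundary, particle_size=100,
        c1=0.5, c2=0.5, w=0.7, number_of_iterations=50):
    """Illustrative PSO sketch for a box-constrained objective."""
    dimensions = len(lower_boundary)
    # Random initial positions and zero initial velocities.
    positions = [[random.uniform(lower_boundary[d], upper_boundary[d])
                  for d in range(dimensions)] for _ in range(particle_size)]
    velocities = [[0.0] * dimensions for _ in range(particle_size)]
    pbest = [list(p) for p in positions]
    pbest_values = [objective(p) for p in positions]
    best_index = min(range(particle_size), key=lambda i: pbest_values[i])
    gbest, gbest_value = list(pbest[best_index]), pbest_values[best_index]

    for _ in range(number_of_iterations):
        for i in range(particle_size):
            for d in range(dimensions):
                r1, r2 = random.random(), random.random()
                # Velocity update (Equation 2.6, standard form).
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest[d] - positions[i][d]))
                # Position update (Equation 2.7), clipped to the search range.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d],
                                          lower_boundary[d]), upper_boundary[d])
            value = objective(positions[i])
            if value < pbest_values[i]:
                pbest[i], pbest_values[i] = list(positions[i]), value
                if value < gbest_value:
                    gbest, gbest_value = list(positions[i]), value
    return gbest, gbest_value

# Example call with the Table 2.7 settings on the Ackley function:
# best, best_value = pso(ackley, [-5, -5], [5, 5], particle_size=100,
#                        c1=0.5, c2=0.5, number_of_iterations=50)
```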
The graph of values generated with the Ackley test function in the PSO algorithm is shown in Figure 2.11.
The range of values and the expected best values for the Ackley test function are given in Table 2.8.
| Parameter | Value |
| --- | --- |
| Value range | X => [-5, 5], Y => [-5, 5] |
| Expected best solution | X = 0, Y = 0 |
| Expected optimum value | 0 |
| Obtained solution value (X) | -3.1871733347611965e-08 |
| Obtained solution value (Y) | 1.802381914336243e-07 |
| Obtained optimum value | 5.177005206746799e-07 |
The graph of values generated with the Beale test function in the PSO algorithm is shown in Figure 2.12.
The range of values and the expected best values for the Beale test function are given in Table 2.9.
| Parameter | Value |
| --- | --- |
| Value range | X => [-4.5, 4.5], Y => [-4.5, 4.5] |
| Expected best solution | X = 3, Y = 0.5 |
| Expected optimum value | 0 |
| Obtained solution value (X) | 3.000004103428678 |
| Obtained solution value (Y) | 0.5000009276859919 |
| Obtained optimum value | 2.881213431705946e-12 |
The graph of values generated with the Goldstein-Price test function in the PSO algorithm is shown in Figure 2.13.
The range of values and the expected best values for the Goldstein-Price test function are given in Table 2.10.
| Parameter | Value |
| --- | --- |
| Value range | X => [-2, 2], Y => [-2, 2] |
| Expected best solution | X = 0, Y = -1 |
| Expected optimum value | 3 |
| Obtained solution value (X) | 1.443907680869234e-08 |
| Obtained solution value (Y) | -0.999999988551917 |
| Obtained optimum value | 3.00000000000008 |
The graph of values generated with the Levi test function in the PSO algorithm is shown in Figure 2.14.
The range of values and the expected best values for the Levi test function are given in Table 2.11.
| Parameter | Value |
| --- | --- |
| Value range | X => [-10, 10], Y => [-10, 10] |
| Expected best solution | X = 1, Y = 1 |
| Expected optimum value | 0 |
| Obtained solution value (X) | 0.9999999871832931 |
| Obtained solution value (Y) | 1.0000003902407675 |
| Obtained optimum value | 1.6704346415824158e-13 |
The algorithm produced results that closely approach the expected values. Moreover, within the precision limits of Python's numeric data types, increasing the number of particles allows the results to come even closer to the global optimum.
3. References
Çelik, Y. (2013). Optimizasyon Problemlerinde Bal Arılarının Evlilik Optimizasyonu Algoritmasının Performansının Geliştirilmesi. Doctoral thesis, Selçuk Üniversitesi Fen Bilimleri Enstitüsü, Konya, Türkiye.
Laporte, G. (2006). Classical and Modern Heuristics for the Vehicle Routing Problem. International Transactions in Operational Research, 285-300.
Metropolis, N., et al. (1953). Equation of State Calculations by Fast Computing Machines. Journal of Chemical Physics, 21, 1087-1092.
Akbulut, H. E. (2020). Müfredat Temelli Üniversite Ders Çizelgeleme Problemi İçin Bir Tavlama Benzetimi Algoritması. pp. 43-45.
Koç, B. A. (2019). Hemşire Nöbet Çizelgeleme Probleminin Tavlama Benzetimi Algoritması ile Çözümü. pp. 11-18.
Sarıkaya, H. A. (2014). Bütünleşik Tedarik Zinciri Ağında Tesis Yeri Seçimi Problemi İçin Bulanık Çok Amaçlı Programlama Modeline Sezgisel Bir Yaklaşım: Tavlama Benzetimi Algoritması. pp. 92-94.
Güner, E., & Altıparmak, F. (2003). İki Ölçütlü Tek Makinalı Çizelgeleme Problemi İçin Sezgisel Bir Yaklaşım. pp. 27-42.
Çetin, E. (2013). Parçacık Sürüsü Optimizasyonu Tabanlı PID Kontrolör ile AA Servomotor Denetimi. pp. 18-21.
Gözde, H., Kocaarslan, İ., Taplamacıoğlu, M. C., & Çam, E. (2008). İki Bölgeli Güç Sisteminde Parçacık Sürüsü Algoritması ile Yük-Frekans Kontrolü Optimizasyonu. Elektrik-Elektronik ve Bilgisayar Mühendisliği Sempozyumu ELECO'08, 26-30 November, Bursa, 212-216.
Çelik, Y., Yıldız, İ., & Karadeniz, A. T. (2019). Avrupa Bilim ve Teknoloji Dergisi, Özel Sayı, 463-477, October 2019.
SOURCES
- Original Document - in Turkish
- GitHub Repository
- GitHub Repository - Simulated Annealing
- GitHub Repository - Particle Swarm Optimization
DISCLAIMER
This document was translated from Turkish to English using ChatGPT. The original document was written in Turkish. The translation may contain errors and inaccuracies.