Probabilistic Traveling Salesman Problem and Harmony Search Algorithms in Pharmacy Supply Optimization

This paper demonstrates the utilitarian significance of the Probabilistic Traveling Salesman Problem (PTSP) in planning travel routes by companies which provide distribution services for phar‐ macies, with a particular consideration of variable customer demand. The optimization problem was solved using the Harmony Search (HS) algorithm, thus verifying its utility based on one real instance of PTSP (representing the problem of pharmacy supply reliability) and three tasks from the public TSPLIB library (adjusted to PTSP). As a result of the conducted research, significant utility of the hybrid approach was identified, assuming the combination of HS with popular 2‐opt method, which enabled achievement of good results within acceptable period (in practical applications).


Introduction
The functioning of retail pharmacies in Poland is increasingly more connected with efficient distribution of prescription drugs. The system of drug distribution in Poland consists of intense distribution in relatively short (2-3 links), but wide distribution channels (Szołtysek, 2016). The retail pharmacy market in Poland at the end of 2017 comprised of almost 13,300 retail pharmacies, almost 1,300 pharmaceutical outlets (GUS, 2017) and 280 pharmacies using electronic channels (Rynek aptek, 2018). Both in 2017 and in the forecasts for 2018, there is a noticeable growing tendency in the total number of public pharmacies, which additionally complicates the distribution system by expanding its channels, whereby the increase of the number of pharmacies operating within networks is accompanied by a drop in the number of independent pharmacies, which at the end of 2017 constituted 57% of the total number of pharmacies and were responsible for 41% of the value of sales in the pharmaceutical market (IQVIA, 2017). Such a structure is connected with a higher diversification of distribution systems, causing a lot of complications with access to drugs, additionally increasing the competition between different types of pharmacies and enabling the introduction of dynamic stock management in network pharmacies. These are only some of the reasons for which, despite the increase in turnover of pharmacies calculated in retail prices by 4.9%, the profitability of pharmacies on a year-to-year basis decreases alarmingly (IQVIA, 2017). Therefore, the managers face the necessity of searching for new business models of strategic issues, such as the creation of values (Zott, Amit, Massa, 2011) for this particular type of activity, in order to increase their chance of success and minimise the associated risks. One of the main risks borne by pharmacies is the pricing policy for prescription drugs (Rx), which assumes that drug reimbursement is determined six times a year; consequently, the drug price for patients is also determined six times a year, in the amount of reimbursement, with rigid margins and surcharges. In the opinion of specialists, on one hand, this causes an increase of patient co-payment (due to which patients search for cheaper substitutes of prescribed drugs or, in extreme cases, a withdrawal from treatment; an increase of parallel import has also been recorded), on the other hand, this leads to a lower logistics availability of drugs in pharmacies (pharmacies reduce the risk of price changes by minimising the stocks of drugs and decreasing the profitability of pharmacies), increased intensity of distribution and deterioration of financial situation of distribution channels (drug reimbursement system in Poland). Currently, pharmacies are supplied with drugs on the basis of daily deliveries from pharmaceutical wholesalers, using specialist refrigerating rolling stock. Companies providing distribution (transport) services have demarcated operating areas in which they serve a few dozen pharmacies with variable demand. It frequently happens that: (1) not all served pharmacies always order drugs, (2) a pharmacy decides not to have the ordered drug delivered when the route service is already being executed. Such situations cause a risk of deviation from the previously optimised service plan (including the route travel), thus deteriorating the level of logistics service and, usually, generating higher distribution costs. Therefore, the reliability of logistics service in such cases is compromised. While searching for remedies to this type of phenomenon (increasingly more common in the operating practice), the authors drew attention to the potential of the Probabilistic Traveling Salesman Problem (PTSP), which may be used for modelling tasks in logistics, with a particular consideration of transport processes, such as daily delivery of goods with random demand (Liu, 2007).
PTSP is an NP-hard problem (Kiełkowicz, Kokosiński, 2012), which was formulated in the paper Jaillet (1985). Due to the nature of the issue, even solution of its relatively small instances requires the application of heuristic approach. Among the ways of managing tasks in the field of PSTP present in the literature on the subject, we may distinguish the use of classic heuristics (Weiler et al., 2015) and metaheuristics, such as: Evolutionary Algorithm hybrid with local search methods (Kiełkowicz, Kokosiński, 2012), Hybrid Scatter Search (Liu, 2007) and probabilistic Ant Colony System (Bianchi, Gambardella, Dorigo, 2002).
Taking into consideration the achievement of relatively good results by the Harmony Search (HS) algorithm for the instances of the Asymmetric Traveling Salesman Problem  and inability to identify papers in the field of verification of HS effectiveness for PTSP, we decided to eliminate the research gap consisting in determination of utility of the proposed technique for solving utilitarian problems modelled by PTSP. The article is also intended to assess the relevance of combination of HS hybridisation with the popular local search method, i.e. 2-opt (intended to improve the quality of obtained results), while simultaneously considering the possibility to use combined algorithms in practical applications (through the assessment of program execution time in comparison with the obtained reduction of the value of objective function).
The article consists of the following sections: introduction to the raised subject, description of PTSP, analysis of HS structure adjusted for solving PTSP, presentation of the methodology of research, as well as presentation and interpretation of the obtained results. The paper ends with conclusions and planned further directions for development of the proposed method.

Probabilistic Traveling Salesman Problem
PTSP can be interpreted as a Traveling Salesman Problem (TSP), in which for every i node certain p i probability is assigned, representing the necessity to visit the node by a commercial agent. In case when for every i-th customer the p i value amounts to 1, PTSP is equivalent to classic TSP (Kiełkowicz, Kokosiński, 2012). The issue assumes determination of route a priori in such manner as to include all reception points, and at the same time minimise the average route length, assuming visit at the selected subset of nodes in the same order as they would appear in the created plan. PTSP shall be applicable in a situation when the route is planned once for a longer period, whereas the travel is executed with high frequency and changing need to visit particular customers during each journey, with simultaneous lack of possibility or willingness to perform repeated optimisation (Jaillet, 1988).
The paper was based on formulation of PTSP which was presented in Bianchi, Gambardella and Dorigo (2002) and Bowler, Fink and Ball (2003). According to this, presence of completely connected, weighted graph with nodes (representing reception points) from the set V = {i = 1, 2, …, n} was adapted. Each i-th customer was described with the probability of necessity to be visited p i , which was independent from the other values characterising particular vertices. The task consists in determination of λ route consisting of all nodes belonging to set V (a priori route), characterized by a minimal value of the objective function expressed with the following formula: where S means a subset of nodes from set V, L λ (S) is the length of route consisting of points belonging to S (the sequence of visiting nodes is identical to the sequence of appearance of requested points on the a priori route), whereas p(S) is the probability that all customers belonging to S must be visited. It is described with the following formula: The expected length of a priori route λ = (1, 2, …, n) amounts to: where d ij means weight of the edge connecting i and j nodes.
In case when all customers are described with the same p probability (p i = p for each point i ∈ V), a special case of PTSP can be specified -homogeneous PTSP whose practical applications can be recorded as important advantages (frequently various places are described with the same possibility of visiting). For homogeneous PTSP, the formula (3) can be expressed as:

Harmony Search algorithm for PTSP
HS is interesting metaheuristics which was first described in the PhD thesis (Geem, 2000). It is based on the similarity between the process of searching for optimum by algorithmic methods and jazz improvisation. The technique assumes the existence of harmony memory HM which stores the HMS (harmony memory size) of harmonies (interpreted as complete problem solutions). Each element that constitutes HM has a specific number of pitches whose values correspond to the values of decisive variables of a particular result. The algorithm assumes the iterative development of new harmonies and comparison of their values of objective function with the relevant parameter describing the worst solution located in HM -if the new result is more favourable, it replaces the worst element located in HM and harmony memory is sorted again (the harmony in the first position should be described by the most favourable value of objective function). The technique finishes its execution after performing IT iterations. Two parameters, i.e. HMCR (harmony memory consideration rate; HMCR ∈ [0,1]) and PAR (pitch adjustment rate; PAR ∈ [0, 1]), are used during the creation of a new harmony. The first of them is responsible for the probability of using knowledge gathered in the harmony memory, which is implemented through selection of the i pitch value on the basis of the values of pitches located in i position in harmonies belonging to HM. Otherwise (with the probability of 1 -HMCR), random selection of i pitch value is performed using the available scope. Additionally, with the probability equal to PAR, modification of the value, selected from the harmony memory, may be performed on the basis of bw parameter (the operation was described in detail e.g. in paper Hetmaniok et al., 2011).
According to Yang (2009) a skilled musician has three choices, while he improvises: he could play something from his memory, he could play something similar to a known piece or he could compose something new. All possibilities are reflected in the HS and are implemented on the basis of HMCR, PAR and bw parameters.
This article is based on the approach to designing HS which was presented in paper . It assumes description of pitches comprising particular harmonies by integers corresponding to the numbers of particular cities that are to be visited by the salesman (their sequence of appearance corresponds to the sequence in which the commercial agent is to execute the travel). The selection of k pitch in new H harmony on the basis of knowledge gathered in HM -related to HMCR probability -is performed through selection made within the nodes located after the last city added to H, in solutions located in HM (it was assumed that sequence is more important for TSP than the location of a particular city on the route). In the case when the list of available cities appearing in the harmonies belonging to HM directly after the city located in k -1 position in H is empty, a randomly acceptable node is selected.

Figure 1. HS and HHS pseudocodes
Source: own study on the basis of  Additionally, the quality of solutions represented by particular harmonies in the process of travel plan construction was taken into consideration in such manner as to promote better results (the selection of values for k pitch is executed on the basis of the popular roulette wheel method, which is based on the value of objective function of solutions). The modification of pitch value, depending on PAR, is based on selection of the nearest available node, from the last city visited by the salesman (the effectiveness of the described approach for PTSP was referred to in paper Kiełkowicz and Kokosiński, 2012), at the same time eliminating the necessity of choosing the proper value of bw parameter. In order to avoid premature convergence, caused by getting stuck in the local optimum, the mechanism of resetting HM elements (with the exception of the best harmony) was introduced at the moment of executing R iterations from the last result replacement located in the last position in the harmony memory.
This paper also proposes the approach assuming an increase of HS effectiveness through the improvement of results generated by the algorithm, using popular method 2-opt (the paper  indicates the necessity of combining HS with local search techniques in order to improve its exploitation efficiency), thus creating Hybrid Harmony Search (HHS). The pseudocode of both methods (HS and HHS) is presented in Figure 1.

Methodology of research
Our 'test bed' consists of one instance describing the real problem of delivering goods from a wholesale drug warehouse to pharmacies located in the territory of Poland (the test was named real31 -with 31 nodes appearing -and for p = 1, it represents a symmetric variant of TSP; the data were made available by a company, whereas the edge weights were adapted as lengths of routes expressed in metres and determined by means of Google Maps) and three tasks representing a symmetric variant of TSP (bays29, berlin52 and eil101), which are located in the TSPLIB library and characterised by the occurrence of between 29 and 101 nodes (thus enabling verification of the effectiveness of proposed algorithms for tasks of similar size to PTSP instances describing many practical applications). Detailed characteristics of the 'test bed' are shown in Table 1. The appearance of homogeneous PTSP variants was assumed (determination of the value of objective function was based on formula (4)) and value p ∈ {0.75; 0.8; 0.9; 0.95; 1}.
The algorithms were implemented in C# language, whereas the calculations were performed using laptop Lenovo Y520, with the following parameter values: Intel Core i7-7700HQ (4 cores, from 2.8 GHz to 3.8 GHz, 6 MB cache), 32 GB RAM (SO-DIMM DDR4, 2400MHz), 1000 GB SATA 7200 RPM, 240 GB SSD M.2 PCIe and Windows 10 Home 64-bit. The following values of HS and HHS parameters were determined on the basis of  paper: HMS = 5, HMCR = 0.98, PAR = 0.25 and R = 1000. Based on the analysis of the number of iterations after which HS reached convergence for various instances of asymmetric variant of TSP (presented in article Boryczka and Szwarc, 2019), it was assumed that IT = 1,000,000.
Due to the nondeterministic nature of the examined methods, each task was solved 30 times. The obtained value of objective function was subject to obj (its average value was marked as obj , whereas the standard deviation from the sample was marked as σ obj ), expressed in seconds; time taken to reach convergence (create the best harmony) was marked as t c (its average value was marked as c t , whereas the standard deviation from the sample was marked as ó c t ) and expressed in seconds; time to execute the algorithm t e (its average value was marked as e t , whereas the standard deviation from the sample was marked as e t s ). The effectiveness of the proposed HS and HHS algorithms were determined by comparing the obtained results with the results created by 2-opt technique, based on the pseudo-randomly generated initial solution.

Obtained results
Tables 2, 3 and 4 present the results determined by 2-opt, HS and HHS for tasks from the analysed 'test bed', accordingly. On their basis, significant effectiveness of the proposed approach to designing HS (in comparison with the 2-opt technique), as well as occurrence of noticeable benefits occurring as a result of combination of both methods (HHS), was identified. The significant differences between the time to reach convergence and the time to perform the method for HS (Table 3) indicate the possibility of determining an excessively huge IT value and, as a result, allow the reduction of the time required for method performance, while at the same time maintaining the quality of constructed travel routes. Particular attention should be drawn to the occurrence of a small difference between c t and e t in HHS results for tests berlin52 and eil101 (Table 4), showing the effectiveness of hybridisation, thanks to which, in a majority of cases, the result determined by HS could be improved (due to the small number of nodes for test bays29, hybridisation did not improve the results). Both HS and HHS are characterised by a relatively short execution time and small variability, whereas the launch of 2-opt technique, based on the solution determined by HS, did not cause a noticeable extension of the method execution time (the higher values e t occurring in the results for certain tasks and lower obj for HS -in comparison with HHS -occurred as a result of non-determinism of methods and error resulting from the measurement of the real execution time). In summary, it is recommended to use the proposed HHS to solve PTSP instances. It is worth noting the decrease of HS effectiveness (in comparison with 2-opt) for the largest instance, confirming the observation referred to in  paper, according to which in the analysed approach to designing HS, there is a weak mechanism of exploitation (as a result, HS needs to be improved by, for example, hybridizing with local search algorithms). The relatively high value σ obj for 2-opt indicates a significant impact of the quality of initial solution on the value of objective function of the results determined by the technique (it is therefore recommended to run the method on a relatively good base solution).   The values of objective function of solutions, determined by particular methods, were subject to Wilcoxon Signed-Rank, with the alternative hypothesis assuming that A1 results were described by a lower value of obj than A2. The value of 0.05 was assumed as the level of significance (lower p-values indicate the acceptance of an alternative hypothesis). The obtained p-values are presented in Table 5. On their basis, it was concluded that the alternative hypothesis should be accepted, according to which HS constructed better routes than 2-opt, whereas HHS constructed better routes than HS and 2-opt.  Figure 2 presents box graphs for the number of iterations after which HS achieved convergence for particular instances of PTSP. The following regularities were identified on its basis: a noticeable dependence between the described number of iterations and the size of the problem (the necessary number of iterations grew simultaneously with the increase of the number of nodes describing the task), as well as a high variability of the values of the analysed parameter, which indicates a significant impact of non-determinism. Figure 3 presents a portion of the HS convergence graph for eil101 task and value p = 1 (the change of the value of objective function of the best harmony was presented, recorded during performance of the first 500 iterations). On this basis and analysis of dependencies for other tasks, the occurrence of significant dynamics of changes at a very early stage of algorithm execution, as well as the gradual slowdown of the solution improvement in the later period, were identified.

Conclusions and further work
This paper has eliminated an existing research gap through determination of effectiveness of the proposed HS algorithm adjusted for solving PTSP instances. Based on the results of the conducted research, the possibility to apply the method in practical applications was demonstrated, in particular for the instances modelling the utilitarian problem of maintaining the reliability of deliveries to pharmacies, with the growing tendency for changes in demand for drugs, causing deviations from the service of particular nodes in the delivery network, and therefore -changes in the previously determined routes, leading to a number of unfavourable temporal and spatial consequences. The analysis of the impact of 2-opt technique on the quality of determined solutions allowed to formulate the conclusion according to which it is recommended to perform HS hybridisation with the 2-opt method (or its variety enabling a reduction of the computational complexity, i.e. 2-p-opt (Kiełkowicz, Kokosiński, 2012) -after performing a specific number of iterations IT -in particular for the tasks described by a relatively high number of nodes (min-imum 52). Despite a slight increase (in comparison with HS) of the time to run the hybrid algorithm, the obtained results are characterised by a noticeably more favourable value of objective function (the improvement was caused by the improvement of the exploitation efficiency of HS), therefore enabling implementation of the solution in practical applications.
Further work on the adjustment of HS to the issue discussed in this article may include the use of other local search techniques (e.g. 1-shift method, which was used in Kiełkowicz and Kokosiński (2012) paper) and the analysis of the impact of placing the mechanism for improving the exploitation in a different algorithm place (e.g. after generating the initial content of HM or in HS loop).