Finding optimal locations of new stores¶
This tutorial includes everything you need to set up IBM Decision Optimization CPLEX Modeling for Python (DOcplex), build a Mathematical Programming model, and get its solution by solving the model on the cloud with IBM ILOG CPLEX Optimizer.
When you finish this tutorial, you’ll have a foundational knowledge of Prescriptive Analytics.
This notebook is part of Prescriptive Analytics for Python.
It requires a valid subscription to Decision Optimization on Cloud. Try it for free here.
Table of contents:
- Describe the business problem
- How decision optimization can help
- Use decision optimization
- Summary
Describe the business problem¶
- A fictional Coffee Company plans to open N shops in the near future and needs to determine where they should be located knowing the following:
  - Most of the customers of this coffee brewer enjoy reading and borrowing books, so the goal is to locate those shops in such a way that all the city public libraries are within minimal walking distance.
- We use Chicago open data for this example.
- We implement a K-Median model to get the optimal location of our future shops.
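The K-Median model mentioned above can be written compactly. The following is only a sketch, and its notation is an assumption introduced here rather than taken from the notebook: $C$ is the set of candidate shop locations, $L$ the set of libraries, $d_{cb}$ the distance between candidate $c$ and library $b$, $y_c$ equals 1 when a shop opens at $c$, and $x_{cb}$ equals 1 when library $b$ is assigned to the shop at $c$.

```latex
\begin{aligned}
\min \quad & \sum_{c \in C} \sum_{b \in L} d_{cb}\, x_{cb} \\
\text{s.t.} \quad & x_{cb} \le y_c && \forall c \in C,\ \forall b \in L \\
& \sum_{c \in C} x_{cb} = 1 && \forall b \in L \\
& \sum_{c \in C} y_c = N \\
& x_{cb},\ y_c \in \{0, 1\}
\end{aligned}
```

The three constraint families say that a library can only be linked to an open shop, that every library is linked to exactly one shop, and that exactly N shops are opened.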
How decision optimization can help¶
Prescriptive analytics (decision optimization) technology recommends actions that are based on desired outcomes. It takes into account specific scenarios, resources, and knowledge of past and current events. With this insight, your organization can make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.
With prescriptive analytics, you can:
- Automate the complex decisions and trade-offs to better manage your limited resources.
- Take advantage of a future opportunity or mitigate a future risk.
- Proactively update recommendations based on changing events.
- Meet operational goals, increase customer loyalty, prevent threats and fraud, and optimize business processes.
Use decision optimization¶
Step 1: Download the library¶
Run the following code to install the Decision Optimization CPLEX Modeling library. The DOcplex library contains two modeling packages: Mathematical Programming (docplex.mp), which this tutorial uses, and Constraint Programming (docplex.cp).
import sys
try:
    import docplex.mp
except ImportError:
    if hasattr(sys, 'real_prefix'):
        # We are in a virtual env.
        !pip install docplex
    else:
        !pip install --user docplex
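The cell above relies on notebook `!pip` syntax and on `sys.real_prefix`, which is set by virtualenv but not by the standard `venv` module. As a hedged alternative, the sketch below checks for a package without importing it and builds the matching pip command. The helper names `is_installed`, `in_virtual_env` and `pip_command` are hypothetical and not part of the notebook or of DOcplex.

```python
import importlib.util
import sys


def is_installed(package_name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


def in_virtual_env():
    """Detect both virtualenv (sys.real_prefix) and venv (sys.base_prefix)."""
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def pip_command(package_name):
    """Build the pip command line; add --user only outside a virtual env."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if not in_virtual_env():
        cmd.append("--user")
    cmd.append(package_name)
    return cmd


if not is_installed("docplex"):
    print("To install, run:", " ".join(pip_command("docplex")))
```

The same helpers could be reused for the geopy and folium checks later in the notebook.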
Note that the more global package docplex contains another subpackage docplex.cp that is dedicated to Constraint Programming, another branch of optimization.
Step 2: Set up the prescriptive engine¶
- Subscribe to the Decision Optimization on Cloud solve service.
- Get the service URL and your personal API key, and enter your credentials here:
url = "ENTER YOUR URL HERE"
key = "ENTER YOUR KEY HERE"
Step 3: Model the data¶
- The data for this problem is quite simple: it is composed of the list of public libraries and their geographical locations.
- The data is acquired from Chicago open data as a JSON file, which is in the following format:

    "data" : [ [ 1, "13BFA4C7-78CE-4D83-B53D-B57C60B701CF", 1, 1441918880, "885709", 1441918880, "885709", null, "Albany Park", "M, W: 10AM-6PM; TU, TH: 12PM-8PM; F, SA: 9AM-5PM; SU: Closed", "Yes", "Yes ", "3401 W. Foster Avenue", "CHICAGO", "IL", "60625", "(773) 539-5450", [ "http://www.chipublib.org/locations/1/", null ], [ null, "41.975456", "-87.71409", null, false ] ]

  This code snippet represents the library at "3401 W. Foster Avenue", located at latitude 41.975456, longitude -87.71409.
Step 4: Prepare the data¶
We need to collect the list of public library locations and keep their names, latitudes, and longitudes.
# Store longitude, latitude and street crossing name of each public library location.
class XPoint(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "P(%g_%g)" % (self.x, self.y)


class NamedPoint(XPoint):
    def __init__(self, name, x, y):
        XPoint.__init__(self, x, y)
        self.name = name

    def __str__(self):
        return self.name
Define how to compute the earth distance between 2 points¶
To compute the distance between 2 points easily, we use the Python package geopy.
try:
    import geopy.distance
except ImportError:
    if hasattr(sys, 'real_prefix'):
        # We are in a virtual env.
        !pip install geopy
    else:
        !pip install --user geopy
# Simple distance computation between 2 locations.
# Points store x = longitude and y = latitude; geopy expects (latitude, longitude).
from geopy.distance import great_circle

def get_distance(p1, p2):
    return great_circle((p1.y, p1.x), (p2.y, p2.x)).miles
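To make the distance computation less of a black box, here is a minimal pure-Python sketch of a great-circle distance using the haversine formula. It is not geopy's implementation. The function name `haversine_miles` and the Earth radius constant are assumptions made for this illustration, so results may differ slightly from `great_circle`.

```python
import math

# Assumed mean Earth radius in miles; geopy's own constant may differ slightly.
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) pairs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


# Library at 3401 W. Foster Avenue versus the map center used later in the notebook.
d = haversine_miles(41.975456, -87.71409, 41.878, -87.629)
print("%.1f miles" % d)
```

Note the argument order: latitude first, then longitude, which mirrors how `get_distance` passes `(p.y, p.x)`.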
Declare the list of libraries¶
Parse the JSON file to get the list of libraries and store them as Python elements.
def build_libraries_from_url(url, name_pos, lat_long_pos):
    import requests
    import json

    r = requests.get(url)
    # Parse the response body as JSON and keep only the rows.
    myjson = json.loads(r.text)
    myjson = myjson['data']

    libraries = []
    k = 1
    for location in myjson:
        uname = location[name_pos]
        try:
            latitude = float(location[lat_long_pos][1])
            longitude = float(location[lat_long_pos][2])
        except TypeError:
            # Coordinates are missing for this record.
            latitude = longitude = None
        try:
            name = str(uname)
        except Exception:
            name = "???"
        name = "P_%s_%d" % (name, k)
        # Keep only libraries that have usable coordinates.
        if latitude and longitude:
            cp = NamedPoint(name, longitude, latitude)
            libraries.append(cp)
            k += 1
    return libraries
libraries = build_libraries_from_url('https://data.cityofchicago.org/api/views/x8fc-8rcq/rows.json?accessType=DOWNLOAD',
name_pos=12,
lat_long_pos=18)
print("There are %d public libraries in Chicago" % (len(libraries)))
There are 80 public libraries in Chicago
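To see why `name_pos=12` and `lat_long_pos=18` are the right indices, the sketch below parses the single sample record shown in Step 3 without any network access. The helper `extract_location` is hypothetical and exists only for this illustration; the field positions are taken from the sample record.

```python
import json

# The sample record from Step 3, wrapped so that it is valid JSON.
SAMPLE = '''
{"data": [[1, "13BFA4C7-78CE-4D83-B53D-B57C60B701CF", 1, 1441918880, "885709",
           1441918880, "885709", null, "Albany Park",
           "M, W: 10AM-6PM; TU, TH: 12PM-8PM; F, SA: 9AM-5PM; SU: Closed",
           "Yes", "Yes ", "3401 W. Foster Avenue", "CHICAGO", "IL", "60625",
           "(773) 539-5450", ["http://www.chipublib.org/locations/1/", null],
           [null, "41.975456", "-87.71409", null, false]]]}
'''


def extract_location(row, name_pos=12, lat_long_pos=18):
    """Return (name, latitude, longitude), or None when coordinates are missing."""
    try:
        latitude = float(row[lat_long_pos][1])
        longitude = float(row[lat_long_pos][2])
    except (TypeError, ValueError, IndexError):
        return None
    return str(row[name_pos]), latitude, longitude


rows = json.loads(SAMPLE)["data"]
parsed = [extract_location(row) for row in rows]
print(parsed)
```

Index 12 holds the street address used as the library name, and index 18 holds a nested list whose second and third entries are the latitude and longitude as strings.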
Define number of shops to open¶
Create a constant that indicates how many coffee shops we would like to open.
nb_shops = 5
print("We would like to open %d coffee shops" % nb_shops)
We would like to open 5 coffee shops
Validate the data by displaying them¶
We will use the folium library to display a map with markers.
try:
    import folium
except ImportError:
    if hasattr(sys, 'real_prefix'):
        # We are in a virtual env.
        !pip install folium
    else:
        !pip install --user folium
import folium
map_osm = folium.Map(location=[41.878, -87.629], zoom_start=11)
for library in libraries:
    lt = library.y
    lg = library.x
    folium.Marker([lt, lg]).add_to(map_osm)
map_osm
After running the above code, the data is displayed but it is impossible to determine where to ideally open the coffee shops by just looking at the map.
Let’s set up DOcplex to write and solve an optimization model that will help us determine where to locate the coffee shops in an optimal way.
Step 5: Set up the prescriptive model¶
from docplex.mp.environment import Environment
env = Environment()
env.print_information()
* system is: Windows 64bit
* Python is present, version is 2.7.11
* docplex is present, version is (1, 0, 0)
Create the DOcplex model¶
The model contains all the business constraints and defines the objective.
from docplex.mp.model import Model
mdl = Model("coffee shops")
Define the decision variables¶
BIGNUM = 999999999
# Ensure unique points
libraries = set(libraries)
# For simplicity, let's consider that coffee shops candidate locations are the same as libraries locations.
# That is: any library location can also be selected as a coffee shop.
coffeeshop_locations = libraries
# Decision vars
# Binary vars indicating which coffee shop locations will be actually selected
coffeeshop_vars = mdl.binary_var_dict(coffeeshop_locations, name="is_coffeeshop")
#
# Binary vars representing the "assigned" libraries for each coffee shop
link_vars = mdl.binary_var_matrix(coffeeshop_locations, libraries, "link")
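One caveat about `set(libraries)`: the `NamedPoint` class defined earlier does not define `__eq__` or `__hash__`, so Python compares instances by identity. The set therefore removes repeated references to the same object, but not two distinct objects that happen to share coordinates. If coordinate-based deduplication is wanted, a sketch along the following lines could help. `PlainPoint` and `HashablePoint` are hypothetical classes written for this illustration and are not part of the notebook.

```python
class PlainPoint(object):
    """Mimics NamedPoint: no __eq__ or __hash__, so identity is used."""

    def __init__(self, name, x, y):
        self.name = name
        self.x = x
        self.y = y


class HashablePoint(PlainPoint):
    """Two points are equal when their coordinates are equal."""

    def __eq__(self, other):
        return isinstance(other, HashablePoint) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


plain = {PlainPoint("A", -87.7, 41.9), PlainPoint("B", -87.7, 41.9)}
hashed = {HashablePoint("A", -87.7, 41.9), HashablePoint("B", -87.7, 41.9)}
print(len(plain), len(hashed))  # → 2 1
```

Whether duplicates should be merged is a modeling choice: two libraries at the same coordinates would then count once in the objective.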
Express the business constraints¶
First constraint: if the distance between a candidate location and a library is suspect (here, at least BIGNUM miles, which would indicate invalid coordinates), that link is forbidden.
for c_loc in coffeeshop_locations:
    for b in libraries:
        if get_distance(c_loc, b) >= BIGNUM:
            mdl.add_constraint(link_vars[c_loc, b] == 0, "ct_forbid_{0!s}_{1!s}".format(c_loc, b))
Second constraint: each library must be linked to a coffee shop that is open.
mdl.add_constraints(link_vars[c_loc, b] <= coffeeshop_vars[c_loc]
for b in libraries
for c_loc in coffeeshop_locations)
mdl.print_information()
Model: coffee shops
- number of variables: 6480
- binary=6480, integer=0, continuous=0
- number of constraints: 6400
- LE=6400, EQ=0, GE=0, RNG=0
- parameters: defaults
Third constraint: each library is linked to exactly one coffee shop.
mdl.add_constraints(mdl.sum(link_vars[c_loc, b] for c_loc in coffeeshop_locations) == 1
for b in libraries)
mdl.print_information()
Model: coffee shops
- number of variables: 6480
- binary=6480, integer=0, continuous=0
- number of constraints: 6480
- LE=6400, EQ=80, GE=0, RNG=0
- parameters: defaults
Fourth constraint: there is a fixed number of coffee shops to open.
# Total nb of open coffee shops
mdl.add_constraint(mdl.sum(coffeeshop_vars[c_loc] for c_loc in coffeeshop_locations) == nb_shops)
# Print model information
mdl.print_information()
Model: coffee shops
- number of variables: 6480
- binary=6480, integer=0, continuous=0
- number of constraints: 6481
- LE=6400, EQ=81, GE=0, RNG=0
- parameters: defaults
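The second, third and fourth constraints can be checked by hand on a tiny instance. The sketch below is plain Python, independent of DOcplex, and tests a candidate assignment against them. The function `check_assignment` and the toy data are assumptions made for this illustration.

```python
def check_assignment(sites, customers, is_open, link, nb_shops):
    """Return a list of violated-constraint messages (an empty list means feasible)."""
    violations = []
    # Second constraint: a link may only use an open site.
    for c in sites:
        for b in customers:
            if link.get((c, b), 0) > is_open.get(c, 0):
                violations.append("%s is linked to closed site %s" % (b, c))
    # Third constraint: each customer is linked to exactly one site.
    for b in customers:
        n_links = sum(link.get((c, b), 0) for c in sites)
        if n_links != 1:
            violations.append("%s has %d links instead of 1" % (b, n_links))
    # Fourth constraint: a fixed number of sites is open.
    n_open = sum(is_open.get(c, 0) for c in sites)
    if n_open != nb_shops:
        violations.append("%d sites open instead of %d" % (n_open, nb_shops))
    return violations


places = ["A", "B", "C"]
is_open = {"A": 1, "B": 0, "C": 0}
good_link = {("A", "A"): 1, ("A", "B"): 1, ("A", "C"): 1}
bad_link = {("A", "A"): 1, ("B", "B"): 1}  # B is closed and C is unassigned

print(check_assignment(places, places, is_open, good_link, nb_shops=1))
print(check_assignment(places, places, is_open, bad_link, nb_shops=1))
```

In the notebook these checks are not needed at run time, because the solver only returns solutions that satisfy every constraint.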
Express the objective¶
The objective is to minimize the total distance from libraries to coffee shops so that a book reader always gets to our coffee shop easily.
# Minimize total distance from points to hubs
total_distance = mdl.sum(link_vars[c_loc, b] * get_distance(c_loc, b) for c_loc in coffeeshop_locations for b in libraries)
mdl.minimize(total_distance)
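To illustrate what this objective computes, here is a brute-force K-Median sketch in plain Python on a toy instance. It is not how CPLEX solves the model: it enumerates every subset of `k` sites and assigns each point to its nearest open site. The function name `brute_force_k_median` and the one-dimensional toy data are assumptions made for this illustration.

```python
from itertools import combinations


def brute_force_k_median(points, k, distance):
    """Try every choice of k sites; each point is served by its nearest chosen site."""
    best_cost, best_sites = float("inf"), None
    for sites in combinations(points, k):
        cost = sum(min(distance(s, p) for s in sites) for p in points)
        if cost < best_cost:
            best_cost, best_sites = cost, sites
    return best_cost, best_sites


# Two clusters on a line; the best two sites are the cluster medians.
points = [0, 1, 2, 10, 11, 12]
cost, sites = brute_force_k_median(points, 2, lambda a, b: abs(a - b))
print(cost, sites)  # → 4 (1, 11)
```

Enumeration does not scale well: choosing 5 shops among 80 locations already means about 24 million subsets, which is one reason to hand the problem to an optimization engine instead.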
Solve with the Decision Optimization solve service¶
Solve the model on the cloud.
print("# coffee shops locations = %d" % len(coffeeshop_locations))
print("# coffee shops = %d" % nb_shops)
assert mdl.solve(url=url, key=key), "!!! Solve of the model fails"
# coffee shops locations = 80
# coffee shops = 5
Step 6: Investigate the solution and then run an example analysis¶
The solution can be analyzed by displaying the location of the coffee shops on a map.
total_distance = mdl.objective_value
open_coffeeshops = [c_loc for c_loc in coffeeshop_locations if coffeeshop_vars[c_loc].solution_value == 1]
not_coffeeshops = [c_loc for c_loc in coffeeshop_locations if c_loc not in open_coffeeshops]
edges = [(c_loc, b) for b in libraries for c_loc in coffeeshop_locations if int(link_vars[c_loc, b]) == 1]
print("Total distance = %g" % total_distance)
print("# coffee shops = {0}".format(len(open_coffeeshops)))
for c in open_coffeeshops:
    print("new coffee shop: {0!s}".format(c))
Total distance = 209.812
# coffee shops = 5
new coffee shop: P_6100 W. Irving Park Road_5
new coffee shop: P_6 S. Hoyne Avenue_46
new coffee shop: P_4455 N. Lincoln Avenue_65
new coffee shop: P_9525 S. Halsted Street_79
new coffee shop: P_2111 W. 47th Street_7
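As an example of further analysis, the `edges` list of (coffee shop, library) pairs can be grouped by shop to see how many libraries each one serves and how far the farthest one is. The sketch below runs on toy data so that it is self-contained. The function `summarize_edges` is hypothetical; in the notebook one would pass the real `edges` list and `get_distance` instead.

```python
from collections import defaultdict


def summarize_edges(edges, distance):
    """Map each open shop to (number of libraries served, farthest distance)."""
    served = defaultdict(list)
    for shop, library in edges:
        served[shop].append(distance(shop, library))
    return {shop: (len(ds), max(ds)) for shop, ds in served.items()}


# Toy data: points on a line, with shops at 1 and 11.
toy_edges = [(1, 0), (1, 1), (1, 2), (11, 10), (11, 11), (11, 12)]
summary = summarize_edges(toy_edges, lambda a, b: abs(a - b))
for shop, (count, farthest) in sorted(summary.items()):
    print("shop %s serves %d libraries, farthest at %g" % (shop, count, farthest))
```

A large farthest distance for one shop can hint that an extra shop in that area would reduce walking distance more than elsewhere.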
Displaying the solution¶
Coffee shops are highlighted in red.
import folium
map_osm = folium.Map(location=[41.878, -87.629], zoom_start=11)
# Selected coffee shop locations, in red.
for coffeeshop in open_coffeeshops:
    lt = coffeeshop.y
    lg = coffeeshop.x
    folium.Marker([lt, lg], icon=folium.Icon(color='red', icon='info-sign')).add_to(map_osm)
# Remaining libraries, with the default marker.
for b in libraries:
    if b not in open_coffeeshops:
        lt = b.y
        lg = b.x
        folium.Marker([lt, lg]).add_to(map_osm)
# One line per (coffee shop, library) assignment.
for (c, b) in edges:
    coordinates = [[c.y, c.x], [b.y, b.x]]
    folium.PolyLine(coordinates, color='#FF0000', weight=5).add_to(map_osm)
map_osm
Summary¶
You learned how to set up and use IBM Decision Optimization CPLEX Modeling for Python to formulate a Mathematical Programming model and solve it with IBM Decision Optimization on Cloud.
References¶
- CPLEX Modeling for Python documentation
- Decision Optimization on Cloud
- Need help with DOcplex or to report a bug? Please go here.
- Contact us at dofeedback@wwpdl.vnet.ibm.com.