Use Facebook Prophet, sklearn, Python, and the Chicago crime dataset to predict crime rates in Chicago.

About:

This project / case study is for phase 1 of my 100 days of machine learning code challenge.

This is a homework solution to a section in Deep Learning and Machine Learning Practical Workouts.

Problem Statement:

Forecast Chicago crime rate

Technology used:

Dataset(s):

Libraries:

Resources:

Contact:

If for any reason you would like to contact me please do so at the following:

Use fbprophet to predict crime in Chicago

Import data

In [6]:
# Import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from fbprophet import Prophet
In [7]:
# Import CSV data into dataframes.
# error_bad_lines=False skips malformed rows; pandas 1.3 deprecated it in favor of
# on_bad_lines='skip', and pandas 2.0 removed it.
chicago_df_1 = pd.read_csv('../datasets/chicago/Chicago_Crimes_2005_to_2007.csv',
                           error_bad_lines = False)
chicago_df_2 = pd.read_csv('../datasets/chicago/Chicago_Crimes_2008_to_2011.csv',
                           error_bad_lines = False)
chicago_df_3 = pd.read_csv('../datasets/chicago/Chicago_Crimes_2012_to_2017.csv',
                           error_bad_lines = False)
b'Skipping line 533719: expected 23 fields, saw 24\n'
b'Skipping line 1149094: expected 23 fields, saw 41\n'
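The "Skipping line" messages above come from rows with more fields than the header declares. As a minimal sketch of the same behavior on newer pandas (1.3 or later), the example below uses `on_bad_lines='skip'` on a small in-memory CSV. The column names and values are made up for illustration and are not part of the Chicago dataset.

```python
import io
import pandas as pd

# Hypothetical three-column CSV; the third data row has one field too many.
raw = io.StringIO(
    "a,b,c\n"
    "1,2,3\n"
    "4,5,6\n"
    "7,8,9,10\n"
    "11,12,13\n"
)

# on_bad_lines='skip' drops rows with too many fields instead of raising an error.
df = pd.read_csv(raw, on_bad_lines="skip")
print(df.shape)
```

Skipped rows are lost silently apart from the message, so it can be worth comparing the row count against the source file when only a handful of lines are affected.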
In [8]:
print(chicago_df_1.shape)
print(chicago_df_2.shape)
print(chicago_df_3.shape)
(1872343, 23)
(2688710, 23)
(1456714, 23)
In [9]:
# Concatenate data frames
chicago_df = pd.concat([chicago_df_1,
                        chicago_df_2,
                        chicago_df_3])
In [13]:
chicago_df.shape
Out[13]:
(6017767, 23)
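Concatenating files that cover adjacent year ranges can leave the same incident in more than one file. The sketch below shows one way to check for that before any columns are dropped. It assumes that the `ID` column identifies an incident uniquely, which this notebook does not verify, and the two small frames are hypothetical stand-ins for the real files.

```python
import pandas as pd

# Hypothetical stand-ins for two yearly files that share one incident (ID 3).
part_1 = pd.DataFrame({"ID": [1, 2, 3],
                       "Primary Type": ["THEFT", "BATTERY", "ARSON"]})
part_2 = pd.DataFrame({"ID": [3, 4],
                       "Primary Type": ["ARSON", "ROBBERY"]})

combined = pd.concat([part_1, part_2], ignore_index=True)

# Count rows whose ID already appeared earlier in the combined frame.
n_duplicates = int(combined.duplicated(subset="ID").sum())

# Keep only the first occurrence of each ID.
deduplicated = combined.drop_duplicates(subset="ID")
print(n_duplicates, deduplicated.shape)
```

If the count is zero on the real data, the concatenated frame can be used as is.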
In [14]:
chicago_df.head()
Out[14]:
Unnamed: 0 ID Case Number Date Block IUCR Primary Type Description Location Description Arrest ... Ward Community Area FBI Code X Coordinate Y Coordinate Year Updated On Latitude Longitude Location
0 0 4673626 HM274058 04/02/2006 01:00:00 PM 055XX N MANGO AVE 2825 OTHER OFFENSE HARASSMENT BY TELEPHONE RESIDENCE False ... 45.0 11.0 26 1136872.0 1936499.0 2006 04/15/2016 08:55:02 AM 41.981913 -87.771996 (41.981912692, -87.771996382)
1 1 4673627 HM202199 02/26/2006 01:40:48 PM 065XX S RHODES AVE 2017 NARCOTICS MANU/DELIVER:CRACK SIDEWALK True ... 20.0 42.0 18 1181027.0 1861693.0 2006 04/15/2016 08:55:02 AM 41.775733 -87.611920 (41.775732538, -87.611919814)
2 2 4673628 HM113861 01/08/2006 11:16:00 PM 013XX E 69TH ST 051A ASSAULT AGGRAVATED: HANDGUN OTHER False ... 5.0 69.0 04A 1186023.0 1859609.0 2006 04/15/2016 08:55:02 AM 41.769897 -87.593671 (41.769897392, -87.593670899)
3 4 4673629 HM274049 04/05/2006 06:45:00 PM 061XX W NEWPORT AVE 0460 BATTERY SIMPLE RESIDENCE False ... 38.0 17.0 08B 1134772.0 1922299.0 2006 04/15/2016 08:55:02 AM 41.942984 -87.780057 (41.942984005, -87.780056951)
4 5 4673630 HM187120 02/17/2006 09:03:14 PM 037XX W 60TH ST 1811 NARCOTICS POSS: CANNABIS 30GMS OR LESS ALLEY True ... 13.0 65.0 18 1152412.0 1864560.0 2006 04/15/2016 08:55:02 AM 41.784211 -87.716745 (41.784210853, -87.71674491)

5 rows × 23 columns

In [15]:
chicago_df.tail()
Out[15]:
Unnamed: 0 ID Case Number Date Block IUCR Primary Type Description Location Description Arrest ... Ward Community Area FBI Code X Coordinate Y Coordinate Year Updated On Latitude Longitude Location
1456709 6250330 10508679 HZ250507 05/03/2016 11:33:00 PM 026XX W 23RD PL 0486 BATTERY DOMESTIC BATTERY SIMPLE APARTMENT True ... 28.0 30.0 08B 1159105.0 1888300.0 2016 05/10/2016 03:56:50 PM 41.849222 -87.691556 (41.849222028, -87.69155551)
1456710 6251089 10508680 HZ250491 05/03/2016 11:30:00 PM 073XX S HARVARD AVE 1310 CRIMINAL DAMAGE TO PROPERTY APARTMENT True ... 17.0 69.0 14 1175230.0 1856183.0 2016 05/10/2016 03:56:50 PM 41.760744 -87.633335 (41.760743949, -87.63333531)
1456711 6251349 10508681 HZ250479 05/03/2016 12:15:00 AM 024XX W 63RD ST 041A BATTERY AGGRAVATED: HANDGUN SIDEWALK False ... 15.0 66.0 04B 1161027.0 1862810.0 2016 05/10/2016 03:56:50 PM 41.779235 -87.685207 (41.779234743, -87.685207125)
1456712 6253257 10508690 HZ250370 05/03/2016 09:07:00 PM 082XX S EXCHANGE AVE 0486 BATTERY DOMESTIC BATTERY SIMPLE SIDEWALK False ... 7.0 46.0 08B 1197261.0 1850727.0 2016 05/10/2016 03:56:50 PM 41.745252 -87.552773 (41.745251975, -87.552773464)
1456713 6253474 10508692 HZ250517 05/03/2016 11:38:00 PM 001XX E 75TH ST 5007 OTHER OFFENSE OTHER WEAPONS VIOLATION PARKING LOT/GARAGE(NON.RESID.) True ... 6.0 69.0 26 1178696.0 1855324.0 2016 05/10/2016 03:56:50 PM 41.758309 -87.620658 (41.75830866, -87.620658418)

5 rows × 23 columns

Explore Data

In [16]:
chicago_df.head(20)
Out[16]:
Unnamed: 0 ID Case Number Date Block IUCR Primary Type Description Location Description Arrest ... Ward Community Area FBI Code X Coordinate Y Coordinate Year Updated On Latitude Longitude Location
0 0 4673626 HM274058 04/02/2006 01:00:00 PM 055XX N MANGO AVE 2825 OTHER OFFENSE HARASSMENT BY TELEPHONE RESIDENCE False ... 45.0 11.0 26 1136872.0 1936499.0 2006 04/15/2016 08:55:02 AM 41.981913 -87.771996 (41.981912692, -87.771996382)
1 1 4673627 HM202199 02/26/2006 01:40:48 PM 065XX S RHODES AVE 2017 NARCOTICS MANU/DELIVER:CRACK SIDEWALK True ... 20.0 42.0 18 1181027.0 1861693.0 2006 04/15/2016 08:55:02 AM 41.775733 -87.611920 (41.775732538, -87.611919814)
2 2 4673628 HM113861 01/08/2006 11:16:00 PM 013XX E 69TH ST 051A ASSAULT AGGRAVATED: HANDGUN OTHER False ... 5.0 69.0 04A 1186023.0 1859609.0 2006 04/15/2016 08:55:02 AM 41.769897 -87.593671 (41.769897392, -87.593670899)
3 4 4673629 HM274049 04/05/2006 06:45:00 PM 061XX W NEWPORT AVE 0460 BATTERY SIMPLE RESIDENCE False ... 38.0 17.0 08B 1134772.0 1922299.0 2006 04/15/2016 08:55:02 AM 41.942984 -87.780057 (41.942984005, -87.780056951)
4 5 4673630 HM187120 02/17/2006 09:03:14 PM 037XX W 60TH ST 1811 NARCOTICS POSS: CANNABIS 30GMS OR LESS ALLEY True ... 13.0 65.0 18 1152412.0 1864560.0 2006 04/15/2016 08:55:02 AM 41.784211 -87.716745 (41.784210853, -87.71674491)
5 6 4673631 HM263167 03/30/2006 10:30:00 PM 014XX W 73RD PL 0560 ASSAULT SIMPLE APARTMENT True ... 17.0 67.0 08A 1167688.0 1855998.0 2006 04/15/2016 08:55:02 AM 41.760401 -87.660982 (41.760401372, -87.660982392)
6 7 4673632 HM273234 04/05/2006 12:10:00 PM 050XX N LARAMIE AVE 0460 BATTERY SIMPLE SCHOOL, PUBLIC, BUILDING True ... 45.0 11.0 08B 1140791.0 1932993.0 2006 04/15/2016 08:55:02 AM 41.972221 -87.757670 (41.972220564, -87.75766982)
7 8 4673633 HM275105 04/05/2006 03:00:00 PM 067XX S ROCKWELL ST 0820 THEFT $500 AND UNDER STREET False ... 15.0 66.0 06 1160205.0 1859776.0 2006 04/15/2016 08:55:02 AM 41.770926 -87.688304 (41.770925978, -87.688304107)
8 9 4673634 HM275063 04/05/2006 09:30:00 PM 019XX W CHICAGO AVE 0560 ASSAULT SIMPLE PARKING LOT/GARAGE(NON.RESID.) False ... 32.0 24.0 08A 1163122.0 1905349.0 2006 04/15/2016 08:55:02 AM 41.895923 -87.676334 (41.895922672, -87.676333733)
9 10 4673635 HM268513 04/03/2006 03:00:00 AM 063XX S EBERHART AVE 0486 BATTERY DOMESTIC BATTERY SIMPLE SIDEWALK False ... 20.0 42.0 08B 1180669.0 1863047.0 2006 04/15/2016 08:55:02 AM 41.779456 -87.613191 (41.77945628, -87.613190628)
10 11 4673637 HM275073 04/06/2006 11:15:00 AM 0000X N LA SALLE ST 0810 THEFT OVER $500 STREET False ... 42.0 32.0 06 1175135.0 1900412.0 2006 04/15/2016 08:55:02 AM 41.882114 -87.632361 (41.882114362, -87.632361012)
11 12 4673639 HM272124 04/04/2006 08:15:00 PM 029XX S FEDERAL ST 1350 CRIMINAL TRESPASS TO STATE SUP LAND CHA HALLWAY/STAIRWELL/ELEVATOR True ... 3.0 35.0 26 1176025.0 1885766.0 2006 04/15/2016 08:55:02 AM 41.841905 -87.629534 (41.841904764, -87.629533842)
12 13 4673640 HM275082 04/06/2006 11:30:00 AM 017XX E 86TH PL 0935 MOTOR VEHICLE THEFT THEFT/RECOVERY: TRUCK,BUS,MHOME STREET False ... 8.0 45.0 07 1189375.0 1847970.0 2006 04/15/2016 08:55:02 AM 41.737879 -87.581757 (41.737879171, -87.581756795)
13 14 4673642 HM202299 02/26/2006 02:47:21 PM 002XX S LEAMINGTON AVE 1811 NARCOTICS POSS: CANNABIS 30GMS OR LESS SIDEWALK True ... 28.0 25.0 18 1142168.0 1898610.0 2006 04/15/2016 08:55:02 AM 41.877845 -87.753461 (41.87784456, -87.753461293)
14 15 4673643 HM270077 04/03/2006 08:09:00 PM 073XX S WOODLAWN AVE 0420 BATTERY AGGRAVATED:KNIFE/CUTTING INSTR SIDEWALK False ... 5.0 69.0 04B 1185483.0 1856655.0 2006 04/15/2016 08:55:02 AM 41.761804 -87.595743 (41.761804069, -87.595743133)
15 16 4673644 HM187135 02/17/2006 09:26:33 PM 052XX S FAIRFIELD AVE 1811 NARCOTICS POSS: CANNABIS 30GMS OR LESS STREET True ... 14.0 63.0 18 1158926.0 1869898.0 2006 04/15/2016 08:55:02 AM 41.798728 -87.692716 (41.798728387, -87.692716037)
16 17 4673645 HM272962 04/05/2006 08:00:00 AM 024XX W HARRISON ST 0935 MOTOR VEHICLE THEFT THEFT/RECOVERY: TRUCK,BUS,MHOME STREET False ... 2.0 28.0 07 1160214.0 1897293.0 2006 04/15/2016 08:55:02 AM 41.873877 -87.687237 (41.873876903, -87.687236966)
17 18 4673646 HM263178 03/31/2006 08:20:00 AM 067XX S PERRY AVE 0486 BATTERY DOMESTIC BATTERY SIMPLE RESIDENCE False ... 6.0 69.0 08B 1176558.0 1860293.0 2006 04/15/2016 08:55:02 AM 41.771992 -87.628345 (41.771992493, -87.628344689)
18 19 4673648 HM273363 04/05/2006 01:30:00 PM 046XX N MILWAUKEE AVE 1330 CRIMINAL TRESPASS TO LAND PARK PROPERTY True ... 45.0 15.0 26 1140652.0 1930355.0 2006 04/15/2016 08:55:02 AM 41.964984 -87.758246 (41.96498423, -87.758246066)
19 20 4673649 HM263200 03/31/2006 05:00:00 AM 062XX S RACINE AVE 0810 THEFT OVER $500 SCHOOL, PUBLIC, GROUNDS False ... 16.0 67.0 06 1169388.0 1863488.0 2006 04/15/2016 08:55:02 AM 41.780918 -87.654535 (41.780918241, -87.654535186)

20 rows × 23 columns

In [17]:
# Plot missing information 
plt.figure(figsize = (10,10))
sns.heatmap(chicago_df.isnull(), cbar=False, cmap = 'YlGnBu')
Out[17]:
<matplotlib.axes._subplots.AxesSubplot at 0x27e7b510da0>
In [18]:
# Drop some of the columns we know we will not use for this prediction
chicago_df.drop(['Unnamed: 0',
                 'ID',
                 'Case Number',
                 'IUCR',
                 'X Coordinate',
                 'Y Coordinate',
                 'Updated On',
                 'Year',
                 'FBI Code',
                 'Beat',
                 'Ward',
                 'Community Area',
                 'Location',
                 'District',
                 'Latitude',
                 'Longitude' ], inplace = True, axis = 1)
In [19]:
chicago_df.head(10)
Out[19]:
Date Block Primary Type Description Location Description Arrest Domestic
0 04/02/2006 01:00:00 PM 055XX N MANGO AVE OTHER OFFENSE HARASSMENT BY TELEPHONE RESIDENCE False False
1 02/26/2006 01:40:48 PM 065XX S RHODES AVE NARCOTICS MANU/DELIVER:CRACK SIDEWALK True False
2 01/08/2006 11:16:00 PM 013XX E 69TH ST ASSAULT AGGRAVATED: HANDGUN OTHER False False
3 04/05/2006 06:45:00 PM 061XX W NEWPORT AVE BATTERY SIMPLE RESIDENCE False False
4 02/17/2006 09:03:14 PM 037XX W 60TH ST NARCOTICS POSS: CANNABIS 30GMS OR LESS ALLEY True False
5 03/30/2006 10:30:00 PM 014XX W 73RD PL ASSAULT SIMPLE APARTMENT True False
6 04/05/2006 12:10:00 PM 050XX N LARAMIE AVE BATTERY SIMPLE SCHOOL, PUBLIC, BUILDING True False
7 04/05/2006 03:00:00 PM 067XX S ROCKWELL ST THEFT $500 AND UNDER STREET False False
8 04/05/2006 09:30:00 PM 019XX W CHICAGO AVE ASSAULT SIMPLE PARKING LOT/GARAGE(NON.RESID.) False False
9 04/03/2006 03:00:00 AM 063XX S EBERHART AVE BATTERY DOMESTIC BATTERY SIMPLE SIDEWALK False True
In [20]:
# Parse the Date strings (12-hour clock with AM/PM) into datetime values
chicago_df.Date = pd.to_datetime(chicago_df.Date, format = '%m/%d/%Y %I:%M:%S %p')
In [21]:
chicago_df.Date
Out[21]:
0         2006-04-02 13:00:00
1         2006-02-26 13:40:48
2         2006-01-08 23:16:00
3         2006-04-05 18:45:00
4         2006-02-17 21:03:14
5         2006-03-30 22:30:00
6         2006-04-05 12:10:00
7         2006-04-05 15:00:00
8         2006-04-05 21:30:00
9         2006-04-03 03:00:00
10        2006-04-06 11:15:00
11        2006-04-04 20:15:00
12        2006-04-06 11:30:00
13        2006-02-26 14:47:21
14        2006-04-03 20:09:00
15        2006-02-17 21:26:33
16        2006-04-05 08:00:00
17        2006-03-31 08:20:00
18        2006-04-05 13:30:00
19        2006-03-31 05:00:00
20        2006-03-28 22:00:00
21        2006-02-17 21:49:21
22        2006-04-05 18:18:00
23        2006-04-06 09:45:00
24        2006-03-31 09:13:54
25        2006-04-05 22:30:00
26        2006-04-05 22:10:00
27        2006-03-31 10:00:00
28        2006-02-17 22:07:09
29        2006-04-05 17:00:00
                  ...        
1456684   2016-05-03 22:32:00
1456685   2016-05-03 22:07:00
1456686   2016-05-03 22:31:00
1456687   2016-05-03 22:45:00
1456688   2016-05-03 21:00:00
1456689   2016-05-03 19:13:00
1456690   2016-05-03 19:45:00
1456691   2016-05-03 20:56:00
1456692   2016-05-03 22:10:00
1456693   2016-05-03 22:15:00
1456694   2016-05-03 17:00:00
1456695   2016-05-03 23:58:00
1456696   2016-05-03 15:15:00
1456697   2016-05-03 23:50:00
1456698   2016-05-03 23:38:00
1456699   2016-05-03 20:44:00
1456700   2016-05-03 08:00:00
1456701   2016-05-03 22:10:00
1456702   2016-05-03 23:35:00
1456703   2016-05-03 22:15:00
1456704   2016-05-03 23:30:00
1456705   2016-05-03 23:50:00
1456706   2016-05-03 22:25:00
1456707   2016-05-03 23:00:00
1456708   2016-05-03 23:28:00
1456709   2016-05-03 23:33:00
1456710   2016-05-03 23:30:00
1456711   2016-05-03 00:15:00
1456712   2016-05-03 21:07:00
1456713   2016-05-03 23:38:00
Name: Date, Length: 6017767, dtype: datetime64[ns]
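The format string is what turns the 12-hour timestamps into the 24-hour values shown above. A small sketch on two strings taken from the dataset preview illustrates the directives; the variable names are illustrative only.

```python
import pandas as pd

# Two timestamps in the dataset's layout: month/day/year, 12-hour clock, AM/PM.
raw_dates = pd.Series(["04/02/2006 01:00:00 PM", "05/03/2016 12:15:00 AM"])

# %I is the 12-hour hour and %p the AM/PM marker; %H would expect a 24-hour value.
parsed = pd.to_datetime(raw_dates, format="%m/%d/%Y %I:%M:%S %p")
print(parsed.tolist())
```

Passing an explicit format also avoids per-row format inference, which tends to matter on a column with millions of rows.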
In [31]:
chicago_df.index = pd.DatetimeIndex(chicago_df.Date)
In [32]:
chicago_df.head(3)
Out[32]:
Date Block Primary Type Description Location Description Arrest Domestic
Date
2006-04-02 13:00:00 2006-04-02 13:00:00 055XX N MANGO AVE OTHER OFFENSE HARASSMENT BY TELEPHONE RESIDENCE False False
2006-02-26 13:40:48 2006-02-26 13:40:48 065XX S RHODES AVE NARCOTICS MANU/DELIVER:CRACK SIDEWALK True False
2006-01-08 23:16:00 2006-01-08 23:16:00 013XX E 69TH ST ASSAULT AGGRAVATED: HANDGUN OTHER False False
In [34]:
chicago_df['Primary Type'].value_counts()
Out[34]:
THEFT                                1245111
BATTERY                              1079178
CRIMINAL DAMAGE                       702702
NARCOTICS                             674831
BURGLARY                              369056
OTHER OFFENSE                         368169
ASSAULT                               360244
MOTOR VEHICLE THEFT                   271624
ROBBERY                               229467
DECEPTIVE PRACTICE                    225180
CRIMINAL TRESPASS                     171596
PROSTITUTION                           60735
WEAPONS VIOLATION                      60335
PUBLIC PEACE VIOLATION                 48403
OFFENSE INVOLVING CHILDREN             40260
CRIM SEXUAL ASSAULT                    22789
SEX OFFENSE                            20172
GAMBLING                               14755
INTERFERENCE WITH PUBLIC OFFICER       14009
LIQUOR LAW VIOLATION                   12129
ARSON                                   9269
HOMICIDE                                5879
KIDNAPPING                              4734
INTIMIDATION                            3324
STALKING                                2866
OBSCENITY                                422
PUBLIC INDECENCY                         134
OTHER NARCOTIC VIOLATION                 122
NON-CRIMINAL                              96
CONCEALED CARRY LICENSE VIOLATION         90
NON - CRIMINAL                            38
HUMAN TRAFFICKING                         28
RITUALISM                                 16
NON-CRIMINAL (SUBJECT SPECIFIED)           4
Name: Primary Type, dtype: int64
In [35]:
# Top 15 crime types by count
chicago_df['Primary Type'].value_counts().iloc[:15]
Out[35]:
THEFT                         1245111
BATTERY                       1079178
CRIMINAL DAMAGE                702702
NARCOTICS                      674831
BURGLARY                       369056
OTHER OFFENSE                  368169
ASSAULT                        360244
MOTOR VEHICLE THEFT            271624
ROBBERY                        229467
DECEPTIVE PRACTICE             225180
CRIMINAL TRESPASS              171596
PROSTITUTION                    60735
WEAPONS VIOLATION               60335
PUBLIC PEACE VIOLATION          48403
OFFENSE INVOLVING CHILDREN      40260
Name: Primary Type, dtype: int64
In [36]:
order_data = chicago_df['Primary Type'].value_counts().iloc[:15].index
In [37]:
# plot top 15 crime types
plt.figure(figsize = (15,10))
sns.countplot(y = 'Primary Type', data = chicago_df, order = order_data)
Out[37]:
<matplotlib.axes._subplots.AxesSubplot at 0x27e404d16d8>
In [38]:
# Plot top 15 crime locations
plt.figure(figsize = (15,10))
sns.countplot(y = 'Location Description',
              data = chicago_df,
              order = chicago_df['Location Description'].value_counts().iloc[:15].index)
Out[38]:
<matplotlib.axes._subplots.AxesSubplot at 0x27e40649748>
In [40]:
# How many crimes occurred in each year
chicago_df.resample('Y').size()
Out[40]:
Date
2005-12-31    455811
2006-12-31    794684
2007-12-31    621848
2008-12-31    852053
2009-12-31    783900
2010-12-31    700691
2011-12-31    352066
2012-12-31    335670
2013-12-31    306703
2014-12-31    274527
2015-12-31    262995
2016-12-31    265462
2017-12-31     11357
Freq: A-DEC, dtype: int64
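Counting rows per period is all that `resample(...).size()` does here. Newer pandas releases (2.2 and later) deprecate the `'M'`, `'Q'` and `'Y'` resample aliases in favor of `'ME'`, `'QE'` and `'YE'`, so the sketch below uses a different technique that does not depend on those aliases: grouping by the index converted to calendar periods. The timestamps are a hypothetical sample, and the `Arrest` values are placeholders.

```python
import pandas as pd

# Hypothetical incident timestamps used as the index, one row per incident.
stamps = pd.to_datetime([
    "2006-01-08 23:16:00",
    "2006-02-17 21:03:14",
    "2006-02-26 13:40:48",
    "2016-05-03 23:33:00",
])
incidents = pd.DataFrame({"Arrest": [False, True, True, True]}, index=stamps)

# Grouping by calendar period counts the rows in each month or year.
per_month = incidents.groupby(incidents.index.to_period("M")).size()
per_year = incidents.groupby(incidents.index.to_period("Y")).size()
print(per_month)
print(per_year)
```

Unlike `resample`, this grouping omits periods with no rows instead of reporting them as zero, and the result is indexed by periods instead of period-end dates.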
In [41]:
# Plot crimes per year
plt.plot(chicago_df.resample('Y').size())
plt.title('Crime Count Per Year')
plt.xlabel('Years')
plt.ylabel('Number of Crimes')
Out[41]:
Text(0, 0.5, 'Number of Crimes')
In [42]:
# Plot crimes per Month
plt.plot(chicago_df.resample('M').size())
plt.title('Crime Count Per Month')
plt.xlabel('Month')
plt.ylabel('Number of Crimes')
Out[42]:
Text(0, 0.5, 'Number of Crimes')
In [43]:
# Plot crimes per Quarter
plt.plot(chicago_df.resample('Q').size())
plt.title('Crime Count Per Quarter')
plt.xlabel('Quarter')
plt.ylabel('Number of Crimes')
Out[43]:
Text(0, 0.5, 'Number of Crimes')

Data Clean & Prep

In [45]:
# Count crimes per month and turn the Date index back into a regular column
chicago_prophet = chicago_df.resample('M').size().reset_index()
In [46]:
chicago_prophet
Out[46]:
Date 0
0 2005-01-31 33983
1 2005-02-28 32042
2 2005-03-31 36970
3 2005-04-30 38963
4 2005-05-31 40572
5 2005-06-30 40234
6 2005-07-31 41976
7 2005-08-31 41741
8 2005-09-30 39833
9 2005-10-31 40204
10 2005-11-30 36244
11 2005-12-31 33049
12 2006-01-31 37605
13 2006-02-28 34063
14 2006-03-31 43721
15 2006-04-30 69128
16 2006-05-31 79013
17 2006-06-30 77348
18 2006-07-31 82750
19 2006-08-31 80628
20 2006-09-30 75045
21 2006-10-31 76870
22 2006-11-30 70710
23 2006-12-31 67803
24 2007-01-31 67123
25 2007-02-28 53811
26 2007-03-31 71857
27 2007-04-30 70389
28 2007-05-31 78170
29 2007-06-30 55802
... ... ...
115 2014-08-31 25802
116 2014-09-30 23811
117 2014-10-31 23911
118 2014-11-30 20680
119 2014-12-31 20891
120 2015-01-31 20656
121 2015-02-28 16287
122 2015-03-31 21560
123 2015-04-30 21610
124 2015-05-31 23570
125 2015-06-30 23059
126 2015-07-31 24101
127 2015-08-31 24685
128 2015-09-30 22996
129 2015-10-31 22979
130 2015-11-30 20486
131 2015-12-31 21006
132 2016-01-31 20375
133 2016-02-29 18590
134 2016-03-31 21878
135 2016-04-30 20962
136 2016-05-31 23332
137 2016-06-30 23791
138 2016-07-31 24646
139 2016-08-31 24619
140 2016-09-30 23235
141 2016-10-31 23314
142 2016-11-30 21140
143 2016-12-31 19580
144 2017-01-31 11357

145 rows × 2 columns

In [48]:
chicago_prophet.columns = ['Date', 'Crime Count']
In [49]:
chicago_prophet
Out[49]:
Date Crime Count
0 2005-01-31 33983
1 2005-02-28 32042
2 2005-03-31 36970
3 2005-04-30 38963
4 2005-05-31 40572
5 2005-06-30 40234
6 2005-07-31 41976
7 2005-08-31 41741
8 2005-09-30 39833
9 2005-10-31 40204
10 2005-11-30 36244
11 2005-12-31 33049
12 2006-01-31 37605
13 2006-02-28 34063
14 2006-03-31 43721
15 2006-04-30 69128
16 2006-05-31 79013
17 2006-06-30 77348
18 2006-07-31 82750
19 2006-08-31 80628
20 2006-09-30 75045
21 2006-10-31 76870
22 2006-11-30 70710
23 2006-12-31 67803
24 2007-01-31 67123
25 2007-02-28 53811
26 2007-03-31 71857
27 2007-04-30 70389
28 2007-05-31 78170
29 2007-06-30 55802
... ... ...
115 2014-08-31 25802
116 2014-09-30 23811
117 2014-10-31 23911
118 2014-11-30 20680
119 2014-12-31 20891
120 2015-01-31 20656
121 2015-02-28 16287
122 2015-03-31 21560
123 2015-04-30 21610
124 2015-05-31 23570
125 2015-06-30 23059
126 2015-07-31 24101
127 2015-08-31 24685
128 2015-09-30 22996
129 2015-10-31 22979
130 2015-11-30 20486
131 2015-12-31 21006
132 2016-01-31 20375
133 2016-02-29 18590
134 2016-03-31 21878
135 2016-04-30 20962
136 2016-05-31 23332
137 2016-06-30 23791
138 2016-07-31 24646
139 2016-08-31 24619
140 2016-09-30 23235
141 2016-10-31 23314
142 2016-11-30 21140
143 2016-12-31 19580
144 2017-01-31 11357

145 rows × 2 columns

In [54]:
# Rename the date column to 'ds' and the count column to 'y', the names Prophet expects

chicago_prophet_df_final = chicago_prophet.rename(columns = {'Date':'ds','Crime Count':'y'})
In [55]:
chicago_prophet_df_final
Out[55]:
ds y
0 2005-01-31 33983
1 2005-02-28 32042
2 2005-03-31 36970
3 2005-04-30 38963
4 2005-05-31 40572
5 2005-06-30 40234
6 2005-07-31 41976
7 2005-08-31 41741
8 2005-09-30 39833
9 2005-10-31 40204
10 2005-11-30 36244
11 2005-12-31 33049
12 2006-01-31 37605
13 2006-02-28 34063
14 2006-03-31 43721
15 2006-04-30 69128
16 2006-05-31 79013
17 2006-06-30 77348
18 2006-07-31 82750
19 2006-08-31 80628
20 2006-09-30 75045
21 2006-10-31 76870
22 2006-11-30 70710
23 2006-12-31 67803
24 2007-01-31 67123
25 2007-02-28 53811
26 2007-03-31 71857
27 2007-04-30 70389
28 2007-05-31 78170
29 2007-06-30 55802
... ... ...
115 2014-08-31 25802
116 2014-09-30 23811
117 2014-10-31 23911
118 2014-11-30 20680
119 2014-12-31 20891
120 2015-01-31 20656
121 2015-02-28 16287
122 2015-03-31 21560
123 2015-04-30 21610
124 2015-05-31 23570
125 2015-06-30 23059
126 2015-07-31 24101
127 2015-08-31 24685
128 2015-09-30 22996
129 2015-10-31 22979
130 2015-11-30 20486
131 2015-12-31 21006
132 2016-01-31 20375
133 2016-02-29 18590
134 2016-03-31 21878
135 2016-04-30 20962
136 2016-05-31 23332
137 2016-06-30 23791
138 2016-07-31 24646
139 2016-08-31 24619
140 2016-09-30 23235
141 2016-10-31 23314
142 2016-11-30 21140
143 2016-12-31 19580
144 2017-01-31 11357

145 rows × 2 columns
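The three steps above (reset the index, assign column names, rename to `ds` and `y`) can also be done in one expression. This is a sketch of an alternative, not the notebook's method; the series below reuses the first three monthly counts from the table and the variable names are illustrative.

```python
import pandas as pd

# Month-end counts indexed by date, shaped like the output of resample('M').size().
counts = pd.Series(
    [33983, 32042, 36970],
    index=pd.to_datetime(["2005-01-31", "2005-02-28", "2005-03-31"]),
)

# Name the index 'ds' and the values 'y' in a single step.
prophet_input = counts.rename_axis("ds").reset_index(name="y")
print(prophet_input.columns.tolist())
```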

In [56]:
m = Prophet()
m.fit(chicago_prophet_df_final)
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
Out[56]:
<fbprophet.forecaster.Prophet at 0x27e40dc95f8>
In [65]:
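# make_future_dataframe defaults to daily frequency, so this appends 720 daily dates
# to the monthly history
future = m.make_future_dataframe(periods = 720)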
forecast = m.predict(future)
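Because the training data has one row per month while the future frame is generated at the default daily frequency, the forecast mixes month-end history with daily future dates. The pandas-only sketch below compares the dates a daily horizon produces with those of a monthly horizon; it does not call Prophet. With Prophet, the analogous change would be to pass a monthly `freq` argument to `make_future_dataframe`. The 24-month horizon is an assumption chosen to roughly match 720 days.

```python
import pandas as pd

# Last observed month-end in the training data.
last_observed = pd.Timestamp("2017-01-31")

# 24 month-end dates after the last observation (a two-year monthly horizon).
monthly_horizon = pd.date_range(
    start=last_observed, periods=25, freq=pd.offsets.MonthEnd()
)[1:]

# 720 daily dates after the last observation, which is what a daily frequency yields.
daily_horizon = pd.date_range(
    start=last_observed + pd.Timedelta(days=1), periods=720, freq="D"
)
print(monthly_horizon[0], monthly_horizon[-1])
print(daily_horizon[-1])
```

A monthly horizon keeps every forecast row comparable to the monthly totals the model was trained on.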
In [66]:
forecast
Out[66]:
ds trend yhat_lower yhat_upper trend_lower trend_upper additive_terms additive_terms_lower additive_terms_upper yearly yearly_lower yearly_upper multiplicative_terms multiplicative_terms_lower multiplicative_terms_upper yhat
0 2005-01-31 60454.773642 39871.037769 73168.474672 60454.773642 60454.773642 -4762.404217 -4762.404217 -4762.404217 -4762.404217 -4762.404217 -4762.404217 0.0 0.0 0.0 55692.369426
1 2005-02-28 60322.370911 33432.974784 67348.309861 60322.370911 60322.370911 -9500.516358 -9500.516358 -9500.516358 -9500.516358 -9500.516358 -9500.516358 0.0 0.0 0.0 50821.854553
2 2005-03-31 60175.782173 42925.248348 75456.352142 60175.782173 60175.782173 -1224.151952 -1224.151952 -1224.151952 -1224.151952 -1224.151952 -1224.151952 0.0 0.0 0.0 58951.630221
3 2005-04-30 60033.922104 43878.152251 78470.506727 60033.922104 60033.922104 1182.829000 1182.829000 1182.829000 1182.829000 1182.829000 1182.829000 0.0 0.0 0.0 61216.751104
4 2005-05-31 59887.333366 48818.368558 82276.165116 59887.333366 59887.333366 5498.247964 5498.247964 5498.247964 5498.247964 5498.247964 5498.247964 0.0 0.0 0.0 65385.581330
5 2005-06-30 59745.473296 47217.517953 80507.203201 59745.473296 59745.473296 3576.966082 3576.966082 3576.966082 3576.966082 3576.966082 3576.966082 0.0 0.0 0.0 63322.439378
6 2005-07-31 59598.884555 48384.488293 80312.801852 59598.884555 59598.884555 4582.849351 4582.849351 4582.849351 4582.849351 4582.849351 4582.849351 0.0 0.0 0.0 64181.733907
7 2005-08-31 59452.295814 47312.131726 80439.962623 59452.295814 59452.295814 4498.965423 4498.965423 4498.965423 4498.965423 4498.965423 4498.965423 0.0 0.0 0.0 63951.261237
8 2005-09-30 59310.435742 43437.122285 78648.909936 59310.435742 59310.435742 1749.360219 1749.360219 1749.360219 1749.360219 1749.360219 1749.360219 0.0 0.0 0.0 61059.795961
9 2005-10-31 59163.847001 44990.963312 78964.317338 59163.847001 59163.847001 2397.444549 2397.444549 2397.444549 2397.444549 2397.444549 2397.444549 0.0 0.0 0.0 61561.291550
10 2005-11-30 59021.986929 41043.550632 74137.483958 59021.986929 59021.986929 -2064.694573 -2064.694573 -2064.694573 -2064.694573 -2064.694573 -2064.694573 0.0 0.0 0.0 56957.292356
11 2005-12-31 58875.398188 36618.109763 69612.967208 58875.398188 58875.398188 -5991.605511 -5991.605511 -5991.605511 -5991.605511 -5991.605511 -5991.605511 0.0 0.0 0.0 52883.792677
12 2006-01-31 58728.809447 37558.565748 70098.306521 58728.809447 58728.809447 -4772.140541 -4772.140541 -4772.140541 -4772.140541 -4772.140541 -4772.140541 0.0 0.0 0.0 53956.668907
13 2006-02-28 58596.406714 32375.683588 66092.222725 58596.406714 58596.406714 -9502.632319 -9502.632319 -9502.632319 -9502.632319 -9502.632319 -9502.632319 0.0 0.0 0.0 49093.774395
14 2006-03-31 58449.817973 41885.370746 74957.489775 58449.817973 58449.817973 -1224.293758 -1224.293758 -1224.293758 -1224.293758 -1224.293758 -1224.293758 0.0 0.0 0.0 57225.524214
15 2006-04-30 58307.957895 44483.727641 76003.042115 58307.957895 58307.957895 1186.957830 1186.957830 1186.957830 1186.957830 1186.957830 1186.957830 0.0 0.0 0.0 59494.915725
16 2006-05-31 58161.369149 47149.128630 80992.830563 58161.369149 58161.369149 5451.047069 5451.047069 5451.047069 5451.047069 5451.047069 5451.047069 0.0 0.0 0.0 63612.416218
17 2006-06-30 58019.509071 46008.628080 77488.141247 58019.509071 58019.509071 3563.602666 3563.602666 3563.602666 3563.602666 3563.602666 3563.602666 0.0 0.0 0.0 61583.111737
18 2006-07-31 57872.920325 46608.693732 79222.367366 57872.920325 57872.920325 4562.735058 4562.735058 4562.735058 4562.735058 4562.735058 4562.735058 0.0 0.0 0.0 62435.655383
19 2006-08-31 57726.331578 45019.959704 78589.791249 57726.331578 57726.331578 4479.578436 4479.578436 4479.578436 4479.578436 4479.578436 4479.578436 0.0 0.0 0.0 62205.910014
20 2006-09-30 57584.471501 43179.019023 76450.527257 57584.471501 57584.471501 1829.654501 1829.654501 1829.654501 1829.654501 1829.654501 1829.654501 0.0 0.0 0.0 59414.126002
21 2006-10-31 57437.882755 44081.666981 76434.581556 57437.882755 57437.882755 2439.928848 2439.928848 2439.928848 2439.928848 2439.928848 2439.928848 0.0 0.0 0.0 59877.811603
22 2006-11-30 57296.022677 38458.984639 71738.537293 57296.022677 57296.022677 -2045.027660 -2045.027660 -2045.027660 -2045.027660 -2045.027660 -2045.027660 0.0 0.0 0.0 55250.995017
23 2006-12-31 57149.433931 34875.671158 69101.588819 57149.433931 57149.433931 -6012.909961 -6012.909961 -6012.909961 -6012.909961 -6012.909961 -6012.909961 0.0 0.0 0.0 51136.523970
24 2007-01-31 56994.736733 36179.642210 69663.507586 56994.736733 56994.736733 -4782.491825 -4782.491825 -4782.491825 -4782.491825 -4782.491825 -4782.491825 0.0 0.0 0.0 52212.244908
25 2007-02-28 56855.010232 31346.784742 64065.254522 56855.010232 56855.010232 -9501.516526 -9501.516526 -9501.516526 -9501.516526 -9501.516526 -9501.516526 0.0 0.0 0.0 47353.493707
26 2007-03-31 56700.313035 39749.027009 71869.017210 56700.313035 56700.313035 -1225.130705 -1225.130705 -1225.130705 -1225.130705 -1225.130705 -1225.130705 0.0 0.0 0.0 55475.182330
27 2007-04-30 56550.606070 39542.377373 73068.382613 56550.606070 56550.606070 1190.085128 1190.085128 1190.085128 1190.085128 1190.085128 1190.085128 0.0 0.0 0.0 57740.691197
28 2007-05-31 56395.908872 44368.641412 78465.519686 56395.908872 56395.908872 5401.847116 5401.847116 5401.847116 5401.847116 5401.847116 5401.847116 0.0 0.0 0.0 61797.755988
29 2007-06-30 56230.874395 44173.760558 77348.685514 56230.874395 56230.874395 3550.921888 3550.921888 3550.921888 3550.921888 3550.921888 3550.921888 0.0 0.0 0.0 59781.796283
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
835 2018-12-23 5792.144402 -17303.863375 16264.162698 5388.264834 6187.726208 -6250.192153 -6250.192153 -6250.192153 -6250.192153 -6250.192153 -6250.192153 0.0 0.0 0.0 -458.047751
836 2018-12-24 5779.077729 -17372.197730 15602.303531 5373.836280 6175.842870 -6290.109081 -6290.109081 -6290.109081 -6290.109081 -6290.109081 -6290.109081 0.0 0.0 0.0 -511.031352
837 2018-12-25 5766.011055 -17093.126515 16387.761538 5359.407727 6163.959533 -6305.807634 -6305.807634 -6305.807634 -6305.807634 -6305.807634 -6305.807634 0.0 0.0 0.0 -539.796579
838 2018-12-26 5752.944381 -16237.502702 15730.051883 5344.979174 6152.076195 -6298.957364 -6298.957364 -6298.957364 -6298.957364 -6298.957364 -6298.957364 0.0 0.0 0.0 -546.012982
839 2018-12-27 5739.877708 -17445.613748 16254.224774 5330.550621 6140.192857 -6271.692465 -6271.692465 -6271.692465 -6271.692465 -6271.692465 -6271.692465 0.0 0.0 0.0 -531.814757
840 2018-12-28 5726.811034 -17558.238469 15816.503972 5316.122067 6128.309519 -6226.535740 -6226.535740 -6226.535740 -6226.535740 -6226.535740 -6226.535740 0.0 0.0 0.0 -499.724706
841 2018-12-29 5713.744361 -17312.586246 16316.338094 5301.693514 6116.426182 -6166.312273 -6166.312273 -6166.312273 -6166.312273 -6166.312273 -6166.312273 0.0 0.0 0.0 -452.567912
842 2018-12-30 5700.677687 -17385.668014 17418.411359 5287.264961 6104.542844 -6094.055674 -6094.055674 -6094.055674 -6094.055674 -6094.055674 -6094.055674 0.0 0.0 0.0 -393.377987
843 2018-12-31 5687.611014 -16358.171656 16214.789681 5272.844633 6092.597135 -6012.909961 -6012.909961 -6012.909961 -6012.909961 -6012.909961 -6012.909961 0.0 0.0 0.0 -325.298947
844 2019-01-01 5674.544340 -17063.004192 15968.201193 5258.431485 6080.725101 -5926.030254 -5926.030254 -5926.030254 -5926.030254 -5926.030254 -5926.030254 0.0 0.0 0.0 -251.485914
845 2019-01-02 5661.477666 -15975.646019 16621.566665 5244.018337 6068.879860 -5836.485485 -5836.485485 -5836.485485 -5836.485485 -5836.485485 -5836.485485 0.0 0.0 0.0 -175.007818
846 2019-01-03 5648.410993 -16432.857422 17578.453280 5229.725140 6057.241455 -5747.166204 -5747.166204 -5747.166204 -5747.166204 -5747.166204 -5747.166204 0.0 0.0 0.0 -98.755212
847 2019-01-04 5635.344319 -17299.232405 16661.965163 5215.605329 6045.125933 -5660.700423 -5660.700423 -5660.700423 -5660.700423 -5660.700423 -5660.700423 0.0 0.0 0.0 -25.356104
848 2019-01-05 5622.277646 -16398.239359 15549.706351 5201.674499 6033.063234 -5579.380115 -5579.380115 -5579.380115 -5579.380115 -5579.380115 -5579.380115 0.0 0.0 0.0 42.897530
849 2019-01-06 5609.210972 -17289.434998 16441.363976 5187.740855 6020.894531 -5505.100675 -5505.100675 -5505.100675 -5505.100675 -5505.100675 -5505.100675 0.0 0.0 0.0 104.110297
850 2019-01-07 5596.144299 -15884.687110 16785.291645 5173.819232 6008.687435 -5439.315182 -5439.315182 -5439.315182 -5439.315182 -5439.315182 -5439.315182 0.0 0.0 0.0 156.829116
851 2019-01-08 5583.077625 -15539.457809 17315.432554 5159.909516 5996.521446 -5383.004841 -5383.004841 -5383.004841 -5383.004841 -5383.004841 -5383.004841 0.0 0.0 0.0 200.072784
852 2019-01-09 5570.010951 -16908.654346 15989.430712 5145.999800 5984.386157 -5336.666448 -5336.666448 -5336.666448 -5336.666448 -5336.666448 -5336.666448 0.0 0.0 0.0 233.344504
853 2019-01-10 5556.944278 -17032.803491 17387.138959 5132.104015 5972.216699 -5300.317162 -5300.317162 -5300.317162 -5300.317162 -5300.317162 -5300.317162 0.0 0.0 0.0 256.627116
854 2019-01-11 5543.877604 -15941.897191 16690.545634 5118.225536 5960.047241 -5273.516318 -5273.516318 -5273.516318 -5273.516318 -5273.516318 -5273.516318 0.0 0.0 0.0 270.361287
855 2019-01-12 5530.810931 -16129.246061 16127.193538 5104.347058 5947.899870 -5255.403446 -5255.403446 -5255.403446 -5255.403446 -5255.403446 -5255.403446 0.0 0.0 0.0 275.407485
856 2019-01-13 5517.744257 -16844.840368 18350.931741 5090.468579 5936.465582 -5244.751151 -5244.751151 -5244.751151 -5244.751151 -5244.751151 -5244.751151 0.0 0.0 0.0 272.993106
857 2019-01-14 5504.677583 -17052.724654 17184.432672 5076.590101 5925.031294 -5240.031007 -5240.031007 -5240.031007 -5240.031007 -5240.031007 -5240.031007 0.0 0.0 0.0 264.646577
858 2019-01-15 5491.610910 -16728.281653 17345.038757 5062.740775 5913.414172 -5239.490200 -5239.490200 -5239.490200 -5239.490200 -5239.490200 -5239.490200 0.0 0.0 0.0 252.120710
859 2019-01-16 5478.544236 -17625.845855 16597.840195 5049.175913 5901.489625 -5241.236301 -5241.236301 -5241.236301 -5241.236301 -5241.236301 -5241.236301 0.0 0.0 0.0 237.307935
860 2019-01-17 5465.477563 -16691.866686 18407.723800 5035.931350 5889.555559 -5243.327260 -5243.327260 -5243.327260 -5243.327260 -5243.327260 -5243.327260 0.0 0.0 0.0 222.150303
861 2019-01-18 5452.410889 -17165.028670 16859.792348 5022.686787 5877.581118 -5243.863550 -5243.863550 -5243.863550 -5243.863550 -5243.863550 -5243.863550 0.0 0.0 0.0 208.547339
862 2019-01-19 5439.344216 -15907.331378 15961.632388 5009.442224 5865.601582 -5241.079287 -5241.079287 -5241.079287 -5241.079287 -5241.079287 -5241.079287 0.0 0.0 0.0 198.264929
863 2019-01-20 5426.277542 -17285.050014 16129.531154 4995.277181 5853.606555 -5233.429161 -5233.429161 -5233.429161 -5233.429161 -5233.429161 -5233.429161 0.0 0.0 0.0 192.848381
864 2019-01-21 5413.210868 -16648.349053 17594.094160 4980.871201 5841.611528 -5219.668159 -5219.668159 -5219.668159 -5219.668159 -5219.668159 -5219.668159 0.0 0.0 0.0 193.542710

865 rows × 16 columns
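Several `yhat` values near the end of the forecast are negative, which cannot happen for a crime count. One simple post-processing option, sketched below, is to clip predictions and their bounds at zero. This is an assumption about how one might present the results, not something the notebook does, and it does not address why the trend reaches zero. The rows are copied from the output above with only a few columns kept.

```python
import pandas as pd

# Three forecast rows copied from the output above (selected columns only).
forecast_tail = pd.DataFrame({
    "ds": pd.to_datetime(["2018-12-23", "2019-01-04", "2019-01-05"]),
    "yhat": [-458.047751, -25.356104, 42.897530],
    "yhat_lower": [-17303.863375, -17299.232405, -16398.239359],
    "yhat_upper": [16264.162698, 16661.965163, 15549.706351],
})

# Replace negative predictions and bounds with zero; positive values are unchanged.
value_cols = ["yhat", "yhat_lower", "yhat_upper"]
clipped = forecast_tail.copy()
clipped[value_cols] = clipped[value_cols].clip(lower=0)
print(clipped)
```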

In [67]:
figure = m.plot(forecast, xlabel = 'Date', ylabel = 'Crime Count')
In [68]:
figure = m.plot_components(forecast)