Commit

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] committed Apr 11, 2022
1 parent 6c7f22e commit e236030
Showing 11 changed files with 104 additions and 111 deletions.
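The hunks in this commit appear consistent with whitespace normalization: trailing whitespace stripped from lines and files made to end with a single newline. A minimal sketch of that kind of normalization follows. It is an illustration under that assumption, not the hooks' actual implementation, and `normalize_whitespace` is a hypothetical name.

```python
def normalize_whitespace(text: str) -> str:
    """Strip trailing whitespace per line and end the text with one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines left over at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    # An empty file stays empty; otherwise terminate with a single newline.
    return "\n".join(lines) + "\n" if lines else ""


before = "Background  \n=====\t\nlast line\n\n"
after = normalize_whitespace(before)
print(repr(after))
```

Applying such a function twice gives the same result as applying it once, which is why a commit like this one changes each affected line only once.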
1 change: 0 additions & 1 deletion app.json
@@ -5,4 +5,3 @@
}
}
}

6 changes: 2 additions & 4 deletions datasets/baseballdb/README.txt
@@ -10,13 +10,11 @@ Chadwick Baseball Bureau (http://www.chadwick-bureau.com),
from its Register of baseball personnel.

Player performance data for 1871 through 2014 is based on the
Lahman Baseball Database, version 2015-01-24, which is
Copyright (C) 1996-2015 by Sean Lahman.

The tables Parks.csv and HomeGames.csv are based on the game logs
and park code table published by Retrosheet.
This information is available free of charge from and is copyrighted
by Retrosheet. Interested parties may contact Retrosheet at
http://www.retrosheet.org.


2 changes: 1 addition & 1 deletion datasets/baseballdb/core/AwardsManagers.csv
@@ -176,5 +176,5 @@ showabu99,BBWAA Manager of the Year,2014,AL,,
willima04,BBWAA Manager of the Year,2014,NL,,
banisje01,BBWAA Manager of the Year,2015,AL,,
maddojo99,BBWAA Manager of the Year,2015,NL,,
francte01,BBWAA Manager of the Year,2016,AL,,
roberda07,BBWAA Manager of the Year,2016,NL,,
129 changes: 64 additions & 65 deletions datasets/baseballdb/core/readme2014.txt

Large diffs are not rendered by default.

40 changes: 20 additions & 20 deletions datasets/bikes/Readme.txt
@@ -11,14 +11,14 @@ Rua Dr. Roberto Frias, 378


=========================================
Background
=========================================

Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental and
return, has become automatic. Through these systems, a user can easily rent a bike at one location and return it
at another. Currently, there are over 500 bike-sharing programs around the world, comprising
over 500 thousand bicycles. Today, there is great interest in these systems due to their important role in traffic,
environmental and health issues.

Apart from interesting real-world applications of bike sharing systems, the characteristics of the data generated by
these systems make them attractive for research. As opposed to other transport services such as bus or subway, the duration
@@ -30,21 +30,21 @@ events in the city could be detected via monitoring these data.
Data Set
=========================================
The bike-sharing rental process is highly correlated with environmental and seasonal settings. For instance, weather conditions,
precipitation, day of week, season, hour of the day, etc. can affect rental behavior. The core data set is
the two-year historical log for 2011 and 2012 from the Capital Bikeshare system, Washington D.C., USA, which is
publicly available at http://capitalbikeshare.com/system-data. We aggregated the data on both an hourly and a daily basis and then
extracted and added the corresponding weather and seasonal information. Weather information is extracted from http://www.freemeteo.com.

=========================================
Associated tasks
=========================================

- Regression:
Prediction of the hourly or daily bike rental count based on environmental and seasonal settings.

- Event and Anomaly Detection:
The count of rented bikes is also correlated with events in the town, which are easily traceable via search engines.
For instance, a query like "2012-10-30 washington d.c." in Google returns results related to Hurricane Sandy. Some of the important events are
identified in [1]. Therefore the data can also be used to validate anomaly or event detection algorithms.


@@ -56,12 +56,12 @@ Files
- hour.csv : bike sharing counts aggregated on hourly basis. Records: 17379 hours
- day.csv - bike sharing counts aggregated on daily basis. Records: 731 days


=========================================
Dataset characteristics
=========================================
Both hour.csv and day.csv have the following fields, except hr which is not available in day.csv

- instant: record index
- dteday : date
- season : season (1:spring, 2:summer, 3:fall, 4:winter)
@@ -71,7 +71,7 @@ Both hour.csv and day.csv have the following fields, except hr which is not avai
- holiday : whether the day is a holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule)
- weekday : day of the week
- workingday : 1 if the day is neither a weekend nor a holiday, otherwise 0.
+ weathersit :
- 1: Clear, Few clouds, Partly cloudy, Partly cloudy
- 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
- 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
@@ -83,7 +83,7 @@ Both hour.csv and day.csv have the following fields, except hr which is not avai
- casual: count of casual users
- registered: count of registered users
- cnt: count of total rental bikes including both casual and registered

=========================================
License
=========================================
@@ -107,5 +107,5 @@ Use of this dataset in publications must be cited to the following publication:
=========================================
Contact
=========================================

For further information about this dataset please contact Hadi Fanaee-T ([email protected])
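
The field list in this Readme implies two relationships that can be checked mechanically: `cnt` is the sum of `casual` and `registered`, and `workingday` is 1 only when the day is neither a weekend nor a holiday. A minimal sketch of such a check, assuming the column names listed in the Readme. The sample rows are invented for illustration and are not taken from day.csv, and `expected_workingday` and `row_is_consistent` are hypothetical helpers.

```python
import csv
import io
from datetime import date

# Illustrative rows only: the counts are invented, not copied from day.csv.
SAMPLE = """dteday,holiday,workingday,casual,registered,cnt
2011-01-01,0,0,10,20,30
2011-01-03,0,1,5,40,45
2011-01-17,1,0,7,8,15
"""


def expected_workingday(dteday: str, holiday: int) -> int:
    """Return 1 if the day is neither a weekend nor a holiday, otherwise 0."""
    is_weekend = date.fromisoformat(dteday).weekday() >= 5  # 5 = Sat, 6 = Sun
    return 0 if is_weekend or holiday == 1 else 1


def row_is_consistent(row: dict) -> bool:
    """Check the two relationships described in the field list."""
    total_ok = int(row["casual"]) + int(row["registered"]) == int(row["cnt"])
    working_ok = (
        expected_workingday(row["dteday"], int(row["holiday"]))
        == int(row["workingday"])
    )
    return total_ok and working_ok


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(all(row_is_consistent(r) for r in rows))
```

The weekend test is derived from the date itself rather than from the `weekday` column, because the Readme does not state how that column is encoded.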
2 changes: 1 addition & 1 deletion datasets/biofilm.csv
@@ -94,4 +94,4 @@ experiment,isolate,ST,OD600,measurement,replicate,normalized_measurement
2,13,4,0.522,1.316,3,2.521072797
2,14,4,0.576,1.959,3,3.401041667
2,15,4,0.427,1.073,3,2.512880562
2,ATCC_29212,30,0.688,1.122,3,1.630813953
2 changes: 1 addition & 1 deletion datasets/mlb_2013-2016.csv
@@ -118,4 +118,4 @@ Season,Team,Team Salary,Team Salary (in millions),League,Wins,Losses,Winning %,A
2013,Tampa Bay Rays,57030272,57,AL,92,71,0.564,5538,700,1421,296,23,165,670,589,1171,73,38,0.257,0.329,0.408,0.737,3.74,42,60,1464,1315,646,608,153,482,1310,0.24,1.23,13176,6044,4392,1593,59,147,0.783,69,0.99,0.708
2013,Texas Rangers,127197575,127.2,AL,91,72,0.558,5585,730,1465,262,23,176,691,462,1067,149,46,0.262,0.323,0.412,0.735,3.62,46,57,1463.1,1370,636,589,157,498,1309,0.248,1.28,13170,6025,4390,1549,86,146,0.682,68,0.986,0.695
2013,Toronto Blue Jays,118244039,118.2,AL,74,88,0.457,5537,712,1398,273,24,185,669,510,1123,112,41,0.252,0.318,0.411,0.729,4.25,39,58,1452,1451,756,685,195,500,1208,0.259,1.34,13068,6072,4356,1605,111,145,0.75,54,0.982,0.691
2013,Washington Nationals,112431770,112.4,NL,86,76,0.531,5436,656,1365,259,27,161,621,464,1192,88,28,0.251,0.313,0.398,0.71,3.59,47,68,1445.2,1367,626,576,142,405,1236,0.249,1.23,13011,5993,4337,1549,107,146,0.826,43,0.982,0.691
2 changes: 1 addition & 1 deletion datasets/ship-damage.txt
@@ -32,4 +32,4 @@ type,yr_construction,period_op,months,n_damages
5,2,2,437,7
5,3,1,1157,5
5,3,2,2161,12
5,4,2,542,1
2 changes: 1 addition & 1 deletion models/__init__.py
@@ -115,4 +115,4 @@ def plot_elbo(self):
        plt.plot(-self.advi_hist)
        plt.ylabel('ELBO')
        plt.xlabel('iteration')
        sns.despine()
28 changes: 13 additions & 15 deletions models/feedforward.py
@@ -6,33 +6,33 @@


class ForestCoverModel(BayesianModel):

    def __init__(self, n_hidden):
        super(ForestCoverModel, self).__init__()
        self.n_hidden = n_hidden

    def create_model(self, X=None, y=None):
        # Compare against None: truth-testing a multi-element NumPy array
        # raises ValueError.
        if X is not None:
            num_samples, self.num_pred = X.shape

        if y is not None:
            num_samples, self.num_out = y.shape

        model_input = theano.shared(np.zeros(shape=(1, self.num_pred)))
        model_output = theano.shared(np.zeros(shape=(1, self.num_out)))

        self.shared_vars = {
            'model_input': model_input,
            'model_output': model_output
        }

        with pm.Model() as model:
            # Define weights
            weights_1 = pm.Normal('w_1', mu=0, sd=1,
                                  shape=(self.num_pred, self.n_hidden))
            weights_2 = pm.Normal('w_2', mu=0, sd=1,
                                  shape=(self.n_hidden, self.n_hidden))
            # Use num_out, the attribute set above and in fit().
            weights_out = pm.Normal('w_out', mu=0, sd=1,
                                    shape=(self.n_hidden, self.num_out))

            # Define activations
@@ -41,29 +41,27 @@ def create_model(self, X=None, y=None):
            acts_out = tt.nnet.softmax(tt.dot(acts_2, weights_out))  # noqa

            # Define likelihood
            out = pm.Multinomial('likelihood', n=1, p=acts_out,
                                 observed=model_output)

        return model

    def fit(self, X, y, n=200000, batch_size=10):
        """
        Train the Bayesian NN model.
        """
        num_samples, self.num_pred = X.shape
        _, self.num_out = y.shape

        if self.cached_model is None:
            self.cached_model = self.create_model()

        with self.cached_model:
            minibatches = {
                self.shared_vars['model_input']: pm.Minibatch(X, batch_size=batch_size),
                self.shared_vars['model_output']: pm.Minibatch(y, batch_size=batch_size),
            }
            self._inference(minibatches, n)

        return self


1 change: 0 additions & 1 deletion test_gpu.py
@@ -21,4 +21,3 @@
    print('Used the cpu')
else:
    print('Used the gpu')
